UNIVERSALITY OF THE MEAN-FIELD FOR THE POTTS MODEL


ANIRBAN BASAK* AND SUMIT MUKHERJEE†


Abstract. We consider the Potts model with $q$ colors on a sequence of weighted graphs with
adjacency matrices $A_n$, allowing for both positive and negative weights. Under a mild regularity
condition on $A_n$ we show that the mean-field prediction for the log partition function is asymptotically
correct, whenever $\mathrm{tr}(A_n^2) = o(n)$. In particular, our results are applicable for the Ising and the
Potts models on any sequence of graphs with average degree going to $+\infty$. Using this, we establish
the universality of the limiting log partition function of the ferromagnetic Potts model for a sequence
of asymptotically regular graphs, and that of the Ising model for bi-regular bipartite graphs
in both ferromagnetic and anti-ferromagnetic domain. We also derive a large deviation principle
for the empirical measure of the colors for the Potts model on asymptotically regular graphs.


1. Introduction 

One of the fundamental models in statistical physics is the nearest neighbor $q$-state Potts model.
For a finite undirected graph $G := (V,E)$, with vertex set $V$, and edge set $E$, the Potts model is a
probability measure on $[q]^{|V|}$ with $[q] := \{1,2,\ldots,q\}$, where $|\cdot|$ denotes the cardinality of a set.
The probability mass function for the Potts model at $y := \{y_i, i \in V\}$ is given by

$$\mu^{\beta,B}(y) := \frac{1}{Z_G(\beta,B)} \exp\Big\{ \beta \sum_{(i,j)\in E} \delta(y_i,y_j) + B \sum_{i\in V} \delta(y_i,1) \Big\}. \tag{1.1}$$

Here $\delta(y,y') = 1_{y=y'}$, and $Z_G(\beta,B)$ is the normalizing constant, which is commonly termed
the partition function. The parameters $\beta$ and $B$ are known as the inverse temperature parameter and the
external magnetic field parameter respectively; $\beta > 0$ is said to be the ferromagnetic regime,
and $\beta < 0$ is the anti-ferromagnetic regime. When $q = 2$, the measure $\mu^{\beta,B}$ is the well known
Ising measure.

Although Ising and Potts models originated from statistical physics [34, 41], due to its wide 
applications it has received a lot of recent interest from varied areas, including statistics (cf. [1, 
5, 15, 42] and references therein), computer science (cf. [4, 12, 31, 44] and references therein), 
combinatorics, finance, social networks, computer vision, biology, and signal processing. Potts 
models on graphs also have connections with many graph properties, such as the number of proper 
colorings, max cut, min cut, min bisection (cf. [2, 10, 11, 22] and references therein), which are of 
interest in classical graph theory. One of the main difficulties in the study of the Ising and the 
Potts model is the intractability of its partition function. If the partition function were available 
in closed form, one could analyze it to compute moments and limiting distributions, carry on 
inference in a statistical framework using maximum likelihood, or compute thermodynamic limits 
of these models which are of interest in statistical physics. As the partition function involves 
summing the unnormalized mass function over exponentially many terms, computing the partition 
function numerically or otherwise is challenging in general. Since exact computations are infeasible,
there are broadly two approaches to tackle this problem. One branch of research is directed towards
devising efficient algorithms to approximate the log partition function (cf. [35, 46], and the references
therein). Probabilists, on the other hand, are interested in studying the asymptotics of the log partition
function for sequences of graphs for large $n$ (cf. [23, 24, 28, 29] and references therein), in an
attempt to understand these measures. More precisely, considering a sequence of graphs $G_n := ([n], E_n)$
with growing size, the goal is to compute the asymptotic limiting log partition function
$\Phi(\beta,B)$, where

$$\Phi(\beta,B) := \lim_{n\to\infty} \frac{1}{n} \Phi_n(\beta,B),$$

and $\Phi_n(\beta,B) := \log Z_{G_n}(\beta,B)$. To get a non-trivial value of $\Phi(\beta,B)$ one must scale $\beta$ appropriately
depending on $|E_n|$. In particular, the inverse temperature parameter in (1.1) should be replaced by
$\beta_n := (n/2|E_n|)\beta$ for the Potts model on $G_n$. This scaling ensures that $\Phi(\beta,B)$ is not a constant
function for all choices of $\beta$ and $B$. By a slight abuse of notation we denote this measure by
$\mu_n^{\beta,B}$.

One common scheme of approximating $\Phi_n(\beta,B)$ is via the naive mean-field method. The mean-field
method has been in the statistical physics literature for a long time (see [14, 38]). Below we describe
the mean-field method in our context in detail:


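1.1. Mean-field method. Let $\mathcal{P}([q]^n)$ denote the space of probability measures on $[q]^n$. For any
two measures $\mu, \nu \in \mathcal{P}([q]^n)$ define the Kullback-Leibler divergence between $\mu$ and $\nu$ by

$$D(\mu\|\nu) := \sum_{y\in[q]^n} \mu(y)\log\mu(y) - \sum_{y\in[q]^n} \mu(y)\log\nu(y),$$

where $0\log 0 = 0$ and $\log 0 = -\infty$ by convention.

Then, for any $\mathbf{q} \in \mathcal{P}([q]^n)$ an easy computation gives

$$D(\mathbf{q}\|\mu_n^{\beta,B}) = \Phi_n(\beta,B) + \sum_{y\in[q]^n} \mathbf{q}(y)\log\mathbf{q}(y) - \sum_{y\in[q]^n} \mathbf{q}(y) H_n^{\beta,B}(y),$$

where

$$H_n^{\beta,B}(y) := \beta_n \sum_{(i,j)\in E_n} \delta(y_i,y_j) + B \sum_{i\in[n]} \delta(y_i,1).$$

Since $D(\mathbf{q}\|\mu_n^{\beta,B}) \ge 0$, with equality iff $\mathbf{q} = \mu_n^{\beta,B}$, we get

$$\Phi_n(\beta,B) = \sup_{\mathbf{q}\in\mathcal{P}([q]^n)} \Big\{ \sum_{y\in[q]^n} \mathbf{q}(y) H_n^{\beta,B}(y) - \sum_{y\in[q]^n} \mathbf{q}(y)\log\mathbf{q}(y) \Big\}. \tag{1.2}$$

In the literature (1.2) is known as the variational formula for the log partition function $\Phi_n(\beta,B)$. From
(1.2) one can obtain a lower bound on $\Phi_n(\beta,B)$ by restricting the supremum in (1.2) to product
measures, i.e. $\mathbf{q} = \prod_{i\in[n]} \mathbf{q}_i \in \mathcal{P}([q])^n$. Therefore

$$\Phi_n(\beta,B) \ge \sup_{\mathbf{q}\in\mathcal{P}([q])^n} M_n^{\beta,B}(\mathbf{q}), \tag{1.3}$$

where

$$M_n^{\beta,B}(\mathbf{q}) := \beta_n \sum_{(i,j)\in E_n} \sum_{r\in[q]} \mathbf{q}_i(r)\mathbf{q}_j(r) + B \sum_{i\in[n]} \mathbf{q}_i(1) - \sum_{i\in[n],\, r\in[q]} \mathbf{q}_i(r)\log\mathbf{q}_i(r). \tag{1.4}$$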

The RHS of (1.3) is referred to as the mean-field approximation for the log partition function $\Phi_n(\beta,B)$.
Since the supremum in (1.3) is much more tractable than the one in (1.2), it is natural
to look for graph sequences for which (1.3) is asymptotically tight. For the complete graph it
has long been known that the mean-field prediction is indeed tight for both Ising and Potts measures
(see [28, 29, 30]). However, for locally tree-like graphs (see [23, Definition 1.1]) this is not the case.
Indeed, in [20] it is shown that the Bethe prediction is the correct answer for Ising measures on such
graphs when the limiting tree is a Galton-Watson tree whose offspring distribution has a finite
variance. In [26] this was extended to power law distributions, and finally in [23] it was extended
to full generality. Moreover, the same was shown to be true for the Potts model on regular graphs in
[23, 24].
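The inequality (1.3) can be checked by brute force on a very small graph. The following is a minimal sketch, assuming the form of (1.1) with Hamiltonian $\beta_n \sum_{(i,j)\in E_n}\delta(y_i,y_j) + B\sum_i \delta(y_i,1)$; the function names and the two trial product measures are illustrative choices, not part of the paper.

```python
import itertools
import math

def hamiltonian(y, edges, beta_n, B):
    """beta_n * (number of monochromatic edges) + B * (number of vertices with color 1)."""
    mono = sum(1 for i, j in edges if y[i] == y[j])
    ones = sum(1 for c in y if c == 1)
    return beta_n * mono + B * ones

def log_partition(n, q, edges, beta_n, B):
    """Exact log Z by enumerating all q**n colorings (feasible only for tiny n)."""
    terms = [hamiltonian(y, edges, beta_n, B)
             for y in itertools.product(range(1, q + 1), repeat=n)]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def mean_field_value(qs, edges, beta_n, B):
    """Mean-field functional at a product measure; qs[i][r-1] is the mass of color r at vertex i."""
    pair = sum(sum(a * b for a, b in zip(qs[i], qs[j])) for i, j in edges)
    field = sum(qi[0] for qi in qs)
    entropy = -sum(p * math.log(p) for qi in qs for p in qi if p > 0)
    return beta_n * pair + B * field + entropy

# Complete graph on 6 vertices, q = 3 colors (hypothetical parameters).
n, q, beta, B = 6, 3, 1.5, 0.3
edges = list(itertools.combinations(range(n), 2))
beta_n = beta * n / (2 * len(edges))  # the scaling beta_n = (n / 2|E_n|) * beta
phi = log_partition(n, q, edges, beta_n, B)
uniform = [[1.0 / q] * q for _ in range(n)]
tilted = [[0.6, 0.2, 0.2] for _ in range(n)]
print(phi, mean_field_value(uniform, edges, beta_n, B),
      mean_field_value(tilted, edges, beta_n, B))
```

Every product measure gives a value no larger than the exact log partition function; the question studied in the paper is when the best such value is within $o(n)$ of it.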

For the complete graph on $n$ vertices one has $\Theta(n^2)$ edges, whereas locally tree-like graphs have
only $O(n)$ edges (see Definition 1.2 for $O(\cdot)$ and $\Theta(\cdot)$). Therefore, it is natural to ask, for graph
sequences such that $n \ll |E_n| \ll n^2$, whether one of the two predictions is correct for the limiting log
partition function. Very few results are known about the asymptotics of the log partition function
in this regime. See however [9, Theorem 2.10] which in particular shows that if a sequence of graphs
converges in cut metric, then the corresponding log partition functions converge. Also, it follows
from [13, Theorem 1] that the mean-field approximation is correct for the limiting log partition
function of Potts models on a sequence of growing graphs in $\mathbb{Z}^d$, when $d$ goes to $\infty$ as well. We
re-derive both these results to demonstrate the flexibility of our approach (see Theorem 2.4 and Example
1.3.1(d) respectively).

In this paper, we consider Ising and Potts measures (we consider a slightly generalized version 
of standard Potts model, see Definition 1.1) on graphs with growing sizes such that $|E_n|/n \to \infty$,
as $n \to \infty$, and show that the asymptotic log partition function can be expressed as a variational
problem (see Theorem 1.1). Building on Theorem 1.1, and focusing on asymptotically regular graphs, 
we prove the universality of the limiting log partition function in the ferromagnetic domain, and 
confirm that it matches with the one obtained from the complete graph (see Theorem 2.1). We 
further derive asymptotic log partition function for bi-regular bipartite graphs (see Theorem 2.3). 
Recently, in [9] the asymptotic log partition function was derived for graph sequences converging 
in cut metric. As a byproduct of Theorem 1.1 we give an alternate proof of the same (see Section 
2.3). For an outline of the proof techniques of the results we refer the reader to Section 1.4. 

1.2. Statement of main theorem. We will work with the following slightly general version of 
the Potts model. 

Definition 1.1. For $q \ge 2$, let $J$, $h$ be a symmetric $q \times q$ matrix, and a vector of length $q$
respectively. Also let $A_n$ be a real symmetric $n \times n$ matrix. We define a Hamiltonian $H_n^{J,h}(\cdot)$ on
$[q]^n$ by setting

$$H_n^{J,h}(y) := \frac{1}{2} \sum_{i,j=1}^{n} A_n(i,j) \sum_{r,s=1}^{q} J_{rs}\,\delta(y_i,r)\delta(y_j,s) + \sum_{i=1}^{n}\sum_{r=1}^{q} h_r \delta(y_i,r), \tag{1.5}$$

where $y := (y_1,\ldots,y_n)$. Using $H_n^{J,h}(\cdot)$ we now define the following probability measure on $[q]^n$:

$$\mu_n^{J,h}(y) := \frac{1}{Z_n(J,h)} \exp\big(H_n^{J,h}(y)\big), \tag{1.6}$$

where

$$Z_n(J,h) := \sum_{y\in[q]^n} \exp\big(H_n^{J,h}(y)\big).$$

Considering $J$ to be the identity matrix $I_q$, $h = B(1,0,0,\ldots,0)$, and $A_n$ to be the adjacency
matrix of $G_n$ divided by $2|E_n|/n$, we see that the probability measure in (1.6) is a generalized
version of the standard Potts measure $\mu_n^{\beta,B}$. Throughout most of the article, we will fix a choice of
$J$ and $h$. Therefore, to lighten the notation we will often write $H_n(\cdot)$ instead of $H_n^{J,h}(\cdot)$.
Now similarly as before we define the log partition function

$$\Phi_n(J,h) := \log Z_n(J,h).$$


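Arguing as before we also obtain that

$$\Phi_n(J,h) = \sup_{\mathbf{q}\in\mathcal{P}([q]^n)} \Big\{ \sum_{y\in[q]^n} \mathbf{q}(y) H_n^{J,h}(y) - \sum_{y\in[q]^n} \mathbf{q}(y)\log\mathbf{q}(y) \Big\}, \tag{1.7}$$

and

$$\Phi_n(J,h) \ge \sup_{\mathbf{q}\in\mathcal{P}([q])^n} M_n^{J,h}(\mathbf{q}), \tag{1.8}$$

where

$$M_n^{J,h}(\mathbf{q}) := \frac{1}{2}\sum_{i,j=1}^{n} A_n(i,j) \sum_{r,s=1}^{q} J_{rs}\,\mathbf{q}_i(r)\mathbf{q}_j(s) + \sum_{i=1}^{n}\sum_{r=1}^{q} h_r \mathbf{q}_i(r) - \sum_{i=1}^{n}\sum_{r=1}^{q} \mathbf{q}_i(r)\log\mathbf{q}_i(r). \tag{1.9}$$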


In Theorem 1.1 below we show that under a fairly general condition (1.8) is actually tight as $n \to \infty$.
Before going to the statement of Theorem 1.1, for convenience of writing, first let us introduce the 
following notation: 


Definition 1.2. Let $a_n$ and $b_n$ be two non-negative sequences of real numbers. We write $a_n = o(b_n)$
if $\lim_{n\to\infty} a_n/b_n = 0$, whereas $a_n = O(b_n)$ implies $\limsup_{n\to\infty} a_n/b_n < \infty$. Note that $a_n = O(b_n)$ includes
the possibility of $a_n = o(b_n)$. Next we use the notation $a_n = \Theta(b_n)$ if $a_n = O(b_n)$ and $b_n = O(a_n)$.


Note that for both Ising and Potts model we must assume some conditions on An to ensure that the 
resulting log partition is 0(n), or equivalently the limiting log partition function to be non-trivial. 
In this paper we work with the following condition: 


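$$\sup_{x\in[0,1]^n} \sum_{i\in[n]} \Big| \sum_{j\in[n]} A_n(i,j)\, x_j \Big| = O(n). \tag{1.10}$$

Now let us denote $\|J\|_\infty := \max_{r,s\in[q]} |J_{rs}|$ and $\|h\|_\infty := \max_{r\in[q]} |h_r|$. Since

$$|H_n^{J,h}(y)| \le \frac{\|J\|_\infty}{2} \sum_{i\in[n],\, r,s\in[q]} \delta(y_i,r) \Big| \sum_{j\in[n]} A_n(i,j)\delta(y_j,s) \Big| + \|h\|_\infty \sum_{i\in[n]}\sum_{r\in[q]} \delta(y_i,r)$$

$$\le \frac{q\|J\|_\infty}{2} \sup_{x\in[0,1]^n} \sum_{i\in[n]} \Big| \sum_{j\in[n]} A_n(i,j)\, x_j \Big| + n\|h\|_\infty,$$

it follows by (1.10) that $\sup_{y\in[q]^n} |H_n^{J,h}(y)| = O(n)$, which implies $\Phi_n(J,h) = O(n)$ as well.
When all entries of $A_n$ have the same sign, condition (1.10) is equivalent to

$$\|A_n\|_1 := \sum_{i,j\in[n]} |A_n(i,j)| = O(n).$$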


If (1.10) does not hold then there exist $J$, $h$ such that the resulting log partition function $\Phi_n(J,h)$
scales super linearly. For example, if all entries of $A_n$ are positive and $J = \beta I_q$, then for any $\beta > 0$
an application of the mean-field lower bound gives $\lim_{n\to\infty} \frac{1}{n}\Phi_n(\beta,B) = +\infty$, thus proving that
(1.10) is necessary for the log partition function to be $O(n)$ in general. If $A_n$ has both positive
and negative entries, (1.10) continues to hold for many well-known models,
such as the Sherrington-Kirkpatrick model and the Hopfield model (see Section 1.3).

Of course we do not expect the mean-field approximation to hold for all matrices $A_n$ satisfying
(1.10). For example, it is known that the mean-field approximation is not correct for the
Sherrington-Kirkpatrick model [45], or Ising models on sparse graphs [21]. With this in mind we
introduce the following definition.

Definition 1.3. Suppose $A_n$ is a sequence of symmetric $n \times n$ matrices satisfying (1.10). We say
that $A_n$ satisfies the mean-field assumption if $\mathrm{tr}(A_n^2) = o(n)$.


Now we are ready to state our first result. 


Theorem 1.1. If $A_n$ satisfies the mean-field assumption, then

$$\lim_{n\to\infty} \frac{1}{n} \Big[ \Phi_n(J,h) - \sup_{\mathbf{q}\in\mathcal{P}([q])^n} M_n^{J,h}(\mathbf{q}) \Big] = 0.$$

Theorem 1.1 essentially says that if $A_n$ is a sequence of matrices which satisfies the mean-field
assumption then the mean-field approximation gives the right answer for the log partition function
up to an error which is $o(n)$.

As an application of Theorem 1.1, one immediately obtains the following corollary. This corollary 
will be used in all of our applications involving graphs. 


Corollary 1.2. Suppose $G_n$ is a sequence of simple graphs, and $A_n$ is the adjacency matrix of
$G_n := ([n],E_n)$ multiplied by $n/(2|E_n|)$, where $|E_n|$ is the number of edges. Then the conclusion of
Theorem 1.1 holds if $n = o(|E_n|)$.

Proof. Since

$$\sup_{x\in[0,1]^n} \sum_{i\in[n]} \Big| \sum_{j\in[n]} A_n(i,j)\, x_j \Big| = \sum_{i,j\in[n]} A_n(i,j) = n,$$

(1.10) holds. Also we have

$$\frac{1}{n}\sum_{i,j\in[n]} A_n(i,j)^2 = \frac{n}{2|E_n|} = o(1),$$

and so $A_n$ satisfies the mean-field assumption. The conclusion then follows by Theorem 1.1.

□
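The two identities used in the proof are easy to confirm numerically. The sketch below is illustrative only: it assumes the scaling $A_n = \frac{n}{2|E_n|}\times$ adjacency from Corollary 1.2, and uses a circulant graph (each vertex joined to its $k$ nearest neighbors on each side of a cycle) as a hypothetical test case with $|E_n| = nk$.

```python
def scaled_adjacency(n, edges):
    """Adjacency matrix of a simple graph multiplied by n / (2|E_n|)."""
    scale = n / (2.0 * len(edges))
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = scale
        A[j][i] = scale
    return A

def total_weight(A):
    """Sum of all entries; equals the supremum in (1.10) when entries are non-negative."""
    return sum(sum(row) for row in A)

def normalized_trace_of_square(A):
    """(1/n) tr(A^2); for symmetric A this is the sum of squared entries divided by n."""
    return sum(x * x for row in A for x in row) / len(A)

def circulant_edges(n, k):
    """Edges joining each vertex to its k nearest neighbors on each side (needs n > 2k)."""
    return [(i, (i + s) % n) for i in range(n) for s in range(1, k + 1)]

n, k = 40, 5
A = scaled_adjacency(n, circulant_edges(n, k))
print(total_weight(A))                # equals n
print(normalized_trace_of_square(A))  # equals n / (2|E_n|) = 1 / (2k)
```

For this family the second quantity is $1/(2k)$, so the mean-field assumption holds exactly when the degree $2k$ grows with $n$.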


Below we consider a few different choices of $A_n$, and verify for which of those the mean-field
assumption is satisfied. 


1.3. Examples. This is broadly divided into two categories. 


1.3.1. Matrices $A_n$ which are scaled adjacency matrices of graphs.

(a) Let $G_n$ be any sequence of simple dense labeled graphs on $n$ vertices, i.e. it has $\Theta(n^2)$ edges.
Let $A_n$ be the adjacency matrix of $G_n$ scaled by $n$, i.e. $A_n(i,j) := \frac{1}{n} 1\{(i,j)\in E_n\}$. Since this scaling is
equivalent to the scaling proposed in Corollary 1.2, it suffices to check that $n = o(|E_n|)$. But
this is immediate as $|E_n| = \Theta(n^2)$.

(b) Let $G_n$ be a $d_n$-regular graph, and $A_n(i,j) := \frac{1}{d_n} 1\{(i,j)\in E_n\}$. In this case again the scaling is the
same one as that of Corollary 1.2, and so it suffices to check that $n = o(|E_n|)$. Since $2|E_n| = n d_n$,
Corollary 1.2 holds iff $d_n \to \infty$.

(c) Let $G_n$ be an Erdős-Rényi random graph with parameter $p_n$. Setting $A_n(i,j) := \frac{1}{n p_n} 1\{(i,j)\in E_n\}$
it again suffices to check by Corollary 1.2 that $n = o(|E_n|)$, in probability. Since $|E_n|$ has a
$\mathrm{Bin}(\binom{n}{2}, p_n)$ distribution,

$$\frac{2|E_n|}{n^2 p_n} \to 1, \text{ in probability},$$

as soon as $n^2 p_n \to \infty$, the mean-field assumption holds in probability iff $n p_n \to \infty$. In particular
the mean-field condition does not hold if $p_n = \frac{\lambda}{n}$ for some $\lambda < \infty$.

(d) Let $G_n^{(d)}$ be a box of the $d$-dimensional integer lattice $\mathbb{Z}^d$. Physicists have long
been interested in studying Ising and Potts models on lattices (see [40, 47], and the references
therein). For any finite $d$, setting $A_n^{(d)}(i,j) := \frac{1}{2d} 1\{(i,j)\in E_n\}$ we note that $\frac{1}{n}\mathrm{tr}((A_n^{(d)})^2) = \Theta(\frac{1}{d})$,
and thus the sequence does not satisfy the mean-field assumption. So our results are not
applicable on $\mathbb{Z}^d$ for finite $d$. However, if we allow $d$ to go to infinity (at any rate) along with
$n$, then Corollary 1.2 is applicable. One can check that this also implies that if we let $d \to \infty$
after letting $n \to \infty$, the same conclusion continues to hold. The behavior of the limiting log partition
function for the Potts model on $\mathbb{Z}^d$ for large $d$ has been studied in [6, 13]. We recover their
results as an application of Corollary 1.2.

1.3.2. Matrices with both positive and negative entries. A general sufficient condition for
(1.10) to hold is $\|A_n\| := \sup_{x:\|x\|_2=1} \|A_n x\|_2 = O(1)$. To see this note that an application of the
Cauchy-Schwarz inequality gives

$$\sup_{x\in[0,1]^n} \sum_{i\in[n]} \Big| \sum_{j\in[n]} A_n(i,j)\, x_j \Big| \le \sqrt{n} \sup_{x\in[0,1]^n} \|A_n x\|_2 \le \sqrt{n}\,\|A_n\| \sup_{x\in[0,1]^n} \|x\|_2 = O(n).$$

(a) Let $A_n$ be a symmetric matrix with 0 on the diagonal, and $A_n(i,j) = \frac{1}{\sqrt{n}} Z(i,j)$ with

$$\{Z(i,j)\}_{1\le i<j<\infty} \overset{\text{i.i.d.}}{\sim} N(0,1).$$

This is the celebrated Sherrington-Kirkpatrick model of statistical physics introduced in [43].
Since $\|A_n\| = O(1)$, in probability, in this case (see [3, Theorem 2.12]), (1.10) holds. However
$A_n$ does not satisfy the mean-field assumption, as

$$\frac{1}{n}\sum_{i,j\in[n]} A_n(i,j)^2 = \frac{1}{n^2}\sum_{i,j\in[n]} Z(i,j)^2 \to 1, \text{ in probability}.$$

This is expected, as the log partition function in this case is given by the Parisi formula, and
not by the mean-field approximation.
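The failure of the mean-field assumption for the Sherrington-Kirkpatrick matrix can be seen in a quick simulation. This is a minimal sketch using only the standard library; the seed, the matrix size, and the comparison with the complete graph scaled as in Corollary 1.2 are illustrative assumptions.

```python
import random

def sk_normalized_trace(n, seed=0):
    """(1/n) * sum_{i,j} A_n(i,j)^2 for A_n(i,j) = Z(i,j)/sqrt(n), zero diagonal, symmetric."""
    rng = random.Random(seed)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            z = rng.gauss(0.0, 1.0)
            total += 2.0 * (z * z) / n  # entries (i,j) and (j,i) contribute equally
    return total / n

def complete_graph_normalized_trace(n):
    """Same quantity for the complete graph with entries n / (2|E_n|) = 1/(n-1) off the diagonal."""
    entry = 1.0 / (n - 1)
    return n * (n - 1) * entry * entry / n

print(sk_normalized_trace(200))              # stays close to 1
print(complete_graph_normalized_trace(200))  # 1/(n-1), tends to 0
```

The first quantity stays of order one while the second vanishes, matching the distinction between the two models drawn in the text.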

(b) Let $\eta$ be an $n \times m$ matrix of i.i.d. random variables with $P(\eta_{ik} = \pm 1) = \frac{1}{2}$, and let

$$A_n(i,j) := \frac{1}{n}\sum_{k\in[m]} \eta_{ik}\eta_{jk}.$$

This is the Hopfield model of neural networks, first introduced in [33]. In this case also one
has $\|A_n\| = O(1)$, in probability, when $m = \Theta(n)$ (see [3, Section 2.2.2]), and therefore (1.10)
holds. Proceeding to check the mean-field condition one has

$$\mathbb{E}\sum_{i,j\in[n]} A_n(i,j)^2 = \frac{1}{n^2}\sum_{i,j\in[n],\, k,l\in[m]} \big[\delta(i,j) + \delta(k,l) - \delta(i,j)\delta(k,l)\big] = \frac{nm^2 + n^2 m - nm}{n^2},$$

and so the mean-field condition does not hold for $m = \Theta(n)$.

1.4. Proof technique. Establishing the conclusion of Theorem 1.1 for graphs whose adjacency
matrix has a single dominant eigenvalue is much easier, since in that case the behavior of the log
partition function is governed by that eigenvalue. This is indeed the case for Erdős-Rényi random
graphs on $n$ vertices with parameter $p_n$ such that $n p_n \gg \log n$. For example, in this regime the
largest eigenvalue equals $n p_n(1 + o(1))$ (see [36, Section 1]), whereas the second largest eigenvalue
is $o(n p_n)$ (see [32, Theorem 1.1]), providing a spectral gap. Similarly for random $d_n$-regular graphs
on $n$ vertices, one also has a spectral gap, as long as $d_n \ge (\log n)^\gamma$ for some $\gamma$ positive (see [17, 19]).
More generally, any expander graph has a spectral gap, and therefore for such graphs one can show
that the mean-field approximation is asymptotically tight. However, there are many graphs which
are not expanders, such as the $d$-dimensional hypercube $\{0,1\}^d$ with $d \to \infty$. In this case the
number of vertices in the graph is $n = 2^d$, and it is well known that the set of eigenvalues is
$\{d - 2i, 0 \le i \le d\}$ with the multiplicity of $d - 2i$ being $\binom{d}{i}$. Thus the two largest eigenvalues are $d$
and $d - 2$, whose ratio converges to 1 as $d$ becomes large, and consequently there is no dominant
eigenvalue.

Even though there is no spectral gap in the hypercube, it is still the case that the number of
big eigenvalues is small. For example, the largest eigenvalue is $d$, and the proportion of eigenvalues
that lie outside the interval $[-d\delta, d\delta]$, for any $\delta > 0$, equals

$$\frac{1}{2^d}\sum_{i=0}^{d} \binom{d}{i} 1\{|d - 2i| > d\delta\} = P\Big( \Big| d - 2\sum_{i=1}^{d} B_i \Big| > d\delta \Big),$$

where $\{B_i\}_{i=1}^{d}$ are i.i.d. Bernoulli random variables with $P(B_i = 0) = P(B_i = 1) = 0.5$. By the weak
law of large numbers the RHS above is $o(1)$, as $d \to \infty$, and so the proportion of eigenvalues which
are comparable to the leading eigenvalue is $o(1)$. Our proof makes this precise, proving Theorem
1.1, which covers not just the hypercube, but any sequence of graphs satisfying $n = o(|E_n|)$ (see
Corollary 1.2). In fact the main condition of Theorem 1.1, i.e. the condition $\mathrm{tr}(A_n^2) = o(n)$, can be
rewritten as

$$\frac{1}{n}\sum_{i=1}^{n} \lambda_i(A_n)^2 = o(1),$$

with $\lambda_1(A_n),\ldots,\lambda_n(A_n)$ denoting the eigenvalues of $A_n$,
which says that the (properly scaled) empirical eigenvalue distribution converges to 0 in $L^2$. And
of course, as already pointed out, the mean-field approximation does not hold in general when
$|E_n| = \Theta(n)$, thus demonstrating that the conditions of Theorem 1.1 and Corollary 1.2 are tight.
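The hypercube computation above can be made concrete. The following sketch evaluates the proportion of eigenvalues outside $[-d\delta, d\delta]$ from the binomial multiplicities, together with $\frac{1}{n}\mathrm{tr}(A_n^2)$ for the adjacency matrix divided by $d$ (which is the scaling $n/(2|E_n|)$ of Corollary 1.2 for the hypercube). The function names are illustrative.

```python
import math

def fraction_of_large_eigenvalues(d, delta):
    """Proportion of hypercube eigenvalues d - 2i (multiplicity C(d, i)) with |d - 2i| > d*delta."""
    count = sum(math.comb(d, i) for i in range(d + 1) if abs(d - 2 * i) > d * delta)
    return count / 2 ** d

def normalized_trace(d):
    """(1/n) tr(A^2) for A = adjacency / d, computed from the scaled eigenvalues (d - 2i)/d."""
    total = sum(math.comb(d, i) * ((d - 2 * i) / d) ** 2 for i in range(d + 1))
    return total / 2 ** d

for d in (25, 100, 400):
    print(d, fraction_of_large_eigenvalues(d, 0.2), normalized_trace(d))
```

Both columns decrease with $d$; the second one equals $1/d$, since the sum is four times the variance of a Binomial$(d, 1/2)$ variable divided by $d^2$.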

The main tool in the proof of Theorem 1.1 is a modified version of [16, Theorem 1.5]. For readers
not familiar with [16], we informally describe the theorem and the ideas behind the proof of [16,
Theorem 1.5]. Before proceeding, we define the notion of a net of a set.

Definition 1.4. For any $S \subset \mathbb{R}^n$ and $\varepsilon > 0$, a set $\widetilde{S} \subset \mathbb{R}^n$ is said to be an $\varepsilon$-net of $S$, if given $s \in S$
there exists (at least one) $\widetilde{s} \in \widetilde{S}$ such that $\|s - \widetilde{s}\|_2 \le \varepsilon$.
The theorem assumes that $f : [0,1]^n \to \mathbb{R}$ is a smooth function such that the set $\{\nabla f(u) : u \in
\{0,1\}^n\}$ has a $\sqrt{n}\varepsilon$-net $\mathcal{D}_n(\varepsilon)$ with $\log|\mathcal{D}_n(\varepsilon)| = o(n)$, and concludes that

$$\log \sum_{u\in\{0,1\}^n} e^{f(u)} = \sup_{u\in[0,1]^n} \{f(u) - I_n(u)\} + o(n),$$

where $I_n(u) := \sum_{i=1}^{n} u_i\log u_i + (1-u_i)\log(1-u_i)$ is the binary entropy function, and $u :=
(u_1,\ldots,u_n)$.

For the proof, they introduce a measure $\nu_n(\cdot)$ on $\{0,1\}^n$ given by $\nu_n(u) \propto \exp(f(u))$ for
$u := (u_1,u_2,\ldots,u_n) \in \{0,1\}^n$. First it is argued that $f(u)$ and $f(\hat{u})$ are close on a set with
high probability under $\nu_n(\cdot)$, say $\mathcal{A}_n$ (see [16, Lemma 3.1]). Here $\hat{u}_i$ is the conditional expectation of
$u_i$, conditioned on everything else. Therefore $\sum_{u\in\{0,1\}^n} \exp(f(u))$ can be well approximated by
$\sum_{u\in\mathcal{A}_n} \exp(f(\hat{u}))$. Turning to evaluate the latter summation, it is further noted that $g(u,\hat{u})$ and
$I_n(\hat{u})$ are also close on $\mathcal{A}_n$ (see [16, Lemma 3.2]), where for $u \in [0,1]^n$, and $w \in (0,1)^n$,

$$g(u,w) := \sum_{i\in[n]} u_i\log w_i + (1-u_i)\log(1-w_i), \quad \text{and} \quad I_n(w) := g(w,w).$$

Therefore one only needs to control $\sum_{u\in\mathcal{A}_n} \exp(f(\hat{u}) + g(u,\hat{u}) - I_n(\hat{u}))$. To control the above,
the summation over $\mathcal{A}_n$ is broken into smaller sets where each sum is over only those $u$ for which
$\hat{u} \approx p$, for some $p \in [0,1]^n$. Next instead of summing over all choices of $p \in [0,1]^n$, the sum is
restricted to the $\sqrt{n}\varepsilon$-net of the image of the map $u \mapsto \hat{u}$, using the set $\mathcal{D}_n(\varepsilon)$. Thus one obtains

$$\log \sum_{u\in\mathcal{A}_n} \exp\big(f(\hat{u}) + g(u,\hat{u}) - I_n(\hat{u})\big) \approx \log \sum_{p\in\mathcal{D}_n(\varepsilon)} \sum_{u:\hat{u}\approx p} \exp\big(f(p) + g(u,p) - I_n(p)\big). \tag{1.11}$$

Finally noting that

$$\sum_{u\in\{0,1\}^n} e^{g(u,p)} = 1,$$

the proof follows as the size of $\mathcal{D}_n(\varepsilon)$ is sub-exponential.
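The last identity in the sketch, $\sum_{u\in\{0,1\}^n} e^{g(u,p)} = 1$, holds because $e^{g(u,p)} = \prod_i p_i^{u_i}(1-p_i)^{1-u_i}$ is the probability mass function of independent Bernoulli$(p_i)$ variables. A minimal numerical check, with an arbitrary illustrative choice of $p$:

```python
import itertools
import math

def g(u, w):
    """g(u, w) = sum_i u_i log w_i + (1 - u_i) log(1 - w_i), for w in (0, 1)^n."""
    return sum(ui * math.log(wi) + (1 - ui) * math.log(1 - wi) for ui, wi in zip(u, w))

def binary_entropy_term(w):
    """I_n(w) := g(w, w); non-positive, and minimal at w_i = 1/2."""
    return g(w, w)

def total_mass(p):
    """Sum of exp(g(u, p)) over all u in {0, 1}^n."""
    return sum(math.exp(g(u, p)) for u in itertools.product((0, 1), repeat=len(p)))

p = [0.2, 0.7, 0.5, 0.9]
print(total_mass(p))           # 1 up to rounding
print(binary_entropy_term(p))  # a negative number
```

This is why the inner sum over $u$ in (1.11) contributes at most a factor of one for each point $p$ of the net, leaving only the cardinality of the net to control.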


In our proof we follow the same scheme. However, there are several challenges that we had to
overcome to apply this idea in our set-up. First, we need to find a net $\mathcal{D}_n(\varepsilon)$ with appropriate
properties. In our set-up, we need to find a $\sqrt{n}\varepsilon$-net $\mathcal{D}_n(\varepsilon)$ of the set $\{A_n v : v \in \{0,1\}^n\}$. Since we
have very limited assumptions on the structure of $A_n$, obtaining a $\sqrt{n}\varepsilon$-net is not straightforward.
The main difficulty comes from the fact that the eigenvalues of $A_n$ can be unbounded. To overcome
this, we split the range of the eigenvalues into its level sets, and then we choose nets of varying size
across each of the level sets (for more details see the proof of Lemma 3.4).

Equipped with Lemma 3.4, a direct application of [16, Theorem 1.5] proves Theorem 1.1 for
graphs $G_n$ such that

$$\limsup_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} \Big( \frac{n\, d_i(G_n)}{2|E_n|} \Big)^2 < \infty, \tag{1.12}$$

where $\{d_1(G_n),\ldots,d_n(G_n)\}$ are the degrees of $G_n$. The hypercube does satisfy this condition, as
does any regular graph. There are many graphs in the literature such that $n = o(|E_n|)$, but (1.12)
does not hold. For example, let $G_n$ denote the complete bipartite graph $K_{a_n,n-a_n}$, where $a_n$ is a
sequence of natural numbers going to $\infty$ such that $a_n = o(n)$. In this case the LHS of (1.12) equals

$$\frac{n\,[a_n(n-a_n)^2 + (n-a_n)a_n^2]}{4a_n^2(n-a_n)^2} = \frac{n^2}{4a_n(n-a_n)},$$

which is not $O(1)$, as $a_n = o(n)$. Since $|E_n| = a_n(n-a_n)$ with $a_n \to \infty$, Corollary 1.2 is still
applicable for $K_{a_n,n-a_n}$ but [16, Theorem 1.5] does not apply.
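The contrast between regular graphs and unbalanced complete bipartite graphs can be tabulated directly. The sketch below assumes that the quantity in (1.12) is the normalized second moment of the degrees, $\frac{1}{n}\sum_i (n\,d_i/2|E_n|)^2$, which is the reading consistent with the displayed value $n[a_n(n-a_n)^2 + (n-a_n)a_n^2]/(4a_n^2(n-a_n)^2)$; the helper names are hypothetical.

```python
def degree_ratio(degrees):
    """(1/n) * sum_i (n * d_i / (2|E|))^2, using 2|E| = sum of degrees."""
    n = len(degrees)
    two_e = sum(degrees)
    return sum((n * d / two_e) ** 2 for d in degrees) / n

def complete_bipartite_degrees(a, n):
    """Degree sequence of K_{a, n-a}: a vertices of degree n - a, and n - a vertices of degree a."""
    return [n - a] * a + [a] * (n - a)

print(degree_ratio([7] * 50))  # any regular graph gives 1
for n, a in ((100, 10), (10000, 100), (1000000, 1000)):
    # a is of order sqrt(n), so a -> infinity while a = o(n)
    print(n, a, degree_ratio(complete_bipartite_degrees(a, n)))
```

With $a_n \approx \sqrt{n}$ the ratio grows like $\sqrt{n}/4$, even though $|E_n|/n \approx \sqrt{n} \to \infty$, so that Corollary 1.2 still applies.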

To remove the requirement of (1.12) we modify the proofs of [16, Lemma 3.1], and [16, Lemma 
3.2]. In the proof of these two lemmas, at many places, supremum norm bound is used for several 
functions. The condition (1.12) arises because of that. Instead, we carefully use the assumption 
(1.10), and the fact that the hamiltonian in our set-up is a quadratic function. This part of the 
proof has been inspired from [15]. 

In Section 2 we provide several applications of Theorem 1.1. One of which is the computation of 
the limit for asymptotically regular graphs. To be more precise, we call a sequence of graphs to be 
asymptotically regular if the empirical distribution of the row sums of the properly scaled adjacency 
matrix converges to $\delta_1$, and if its mean also converges to one. Using a truncation argument we
derive the desired result. We also find the limit for bi-regular bipartite graphs, for which we carefully 
analyze the solutions of some fixed point equations. Lastly, we identify the limit for a sequence of 
simple graphs converging in cut metric. This follows from a straightforward analysis upon using 
Theorem 1.1. 

1.5. Outline. The outline of the rest of the paper is as follows. As applications of Theorem 1.1, 
in Section 2 we derive the asymptotics of the log partition function for ferromagnetic Potts models 
on asymptotically regular graphs, that of Ising models (both ferromagnetic and anti-ferromagnetic) 
on bi-regular bipartite graphs, and that of Potts model on a sequence of simple graphs converging 
in cut metric in the Lp sense. Section 3 carries out the proof of Theorem 1.1 using three auxiliary 
lemmas, whose proofs are deferred to Section 4. Finally in Section 5 we prove the results appearing 
in Section 2. 

Acknowledgements. We thank Andrea Montanari for suggesting to look at the Ising measure 
on hypercube, Sourav Chatterjee for pointing out the reference [16], and Amir Dembo for helpful 
comments on earlier version of the manuscript. We also thank Sourav Chatterjee, Amir Dembo, and 
Andrea Montanari for many helpful discussions. We further thank Marek Biskup and Aernout Van 
Enter for pointing out the references [6] and [13] respectively. We are grateful to two anonymous 
referees for their detailed comments and suggestions which have improved the quality of this paper. 

2. Applications of theorem 1.1 

2.1. Asymptotically regular graphs. In Theorem 1.1 we saw that the mean-field prediction
is asymptotically correct when $A_n$ satisfies the mean-field condition. However, computing the
supremum of $M_n^{J,h}(\mathbf{q})$ may often be very hard for general matrices $A_n$. Restricting ourselves
to the case $J = \beta I_q$ for $\beta > 0$, in Theorem 2.1 below we show that when the matrices $A_n$
are "asymptotically regular" one can write the $n$-dimensional supremum as a one-dimensional
supremum, thereby providing a more tractable form of the limit. In particular, setting $h_r =
B\delta(r,1)$, for asymptotically regular graphs the limit is the same as the one obtained for a Curie-Weiss
Potts model.

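Theorem 2.1. (a) Let $A_n$ satisfy the mean-field assumption, and let each entry of $A_n$ be non-negative.
Also let $J = \beta I_q$, for some $\beta > 0$. Set $R_n(i) := \sum_{j\in[n]} A_n(i,j)$. If

$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} \delta_{R_n(i)} = \delta_1, \text{ in distribution}, \tag{2.1}$$

and

$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R_n(i) = 1, \tag{2.2}$$

then

$$\lim_{n\to\infty} \frac{1}{n}\Phi_n(J,h) = \sup_{\mathbf{q}\in\mathcal{P}([q])} \Big\{ \frac{\beta}{2}\sum_{r=1}^{q} \mathbf{q}(r)^2 + \sum_{r=1}^{q} h_r\mathbf{q}(r) - \sum_{r=1}^{q} \mathbf{q}(r)\log\mathbf{q}(r) \Big\}. \tag{2.3}$$

(b) In particular, the conclusion of part (a) applies in the following two cases:

(i) $G_n$ is a sequence of $d_n$-regular graphs with $d_n \to \infty$, and $A_n(i,j) = \frac{1}{d_n} 1_{(i,j)\in E_n}$.

(ii) $G_n$ is an Erdős-Rényi random graph with parameter $p_n$ such that $n p_n \to \infty$, and $A_n(i,j) =
\frac{1}{n p_n} 1_{(i,j)\in E_n}$.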
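For $q = 2$ the universality statement can be compared against an exact computation on the complete graph. The sketch below assumes the limit has the Curie-Weiss form $\sup_{\mathbf{q}}\{\frac{\beta}{2}\sum_r \mathbf{q}(r)^2 + \sum_r h_r\mathbf{q}(r) - \sum_r \mathbf{q}(r)\log\mathbf{q}(r)\}$ with $h = (B, 0)$, evaluates the supremum on a grid, and computes $\log Z$ on $K_n$ exactly by grouping colorings according to the number of vertices with color 1. Grid size and parameters are illustrative choices.

```python
import math

def curie_weiss_limit(beta, B, grid=10001):
    """Grid approximation of sup over t in [0,1] of the q = 2 variational functional, q = (t, 1-t)."""
    best = -math.inf
    for k in range(grid):
        t = k / (grid - 1)
        entropy = -sum(x * math.log(x) for x in (t, 1 - t) if x > 0)
        value = beta / 2 * (t * t + (1 - t) ** 2) + B * t + entropy
        best = max(best, value)
    return best

def complete_graph_log_partition(n, beta, B):
    """Exact log Z for q = 2 on K_n with beta_n = beta * n / (2|E_n|) = beta / (n - 1)."""
    beta_n = beta * n / (2 * math.comb(n, 2))
    terms = []
    for k in range(n + 1):  # k = number of vertices with color 1
        log_count = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        mono_edges = math.comb(k, 2) + math.comb(n - k, 2)
        terms.append(log_count + beta_n * mono_edges + B * k)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

beta, B, n = 1.5, 0.2, 2000
print(complete_graph_log_partition(n, beta, B) / n, curie_weiss_limit(beta, B))
```

The two printed numbers agree up to a correction of order $1/n$; Theorem 2.1 asserts that the same limit is obtained for every asymptotically regular sequence satisfying the mean-field assumption.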


As an application of the above theorem, the following theorem derives the large deviation principle for the
empirical measure $L_n$ on $\mathcal{P}([q])$ defined by

$$L_n(r) := \frac{1}{n}\sum_{i\in[n]} \delta(y_i,r).$$

Below we recall a few definitions of large deviation theory which are necessary for our paper. 

Definition 2.1. Let $(\mathcal{X}, \mathcal{B})$ be a measure space equipped with a topology such that every open set
is in $\mathcal{B}$. A function $I : \mathcal{X} \to [0,\infty]$ is said to be a rate function if it is lower semi-continuous, i.e.
for every $\alpha < \infty$ the set $\{x \in \mathcal{X} : I(x) \le \alpha\}$ is closed. The function $I$ is said to be a good rate
function, if further the set $\{x \in \mathcal{X} : I(x) \le \alpha\}$ is compact as well. In particular if $\mathcal{X}$ is compact,
any rate function is a good rate function.

A sequence of probability measures $P_n$ on $(\mathcal{X}, \mathcal{B})$ is said to satisfy a large deviation principle on $\mathcal{X}$ with
respect to a good rate function $I(\cdot)$, at speed $n$, if for every closed set $F$, and open set $U$, we have

$$\limsup_{n\to\infty} \frac{1}{n}\log P_n(F) \le -\inf_{x\in F} I(x),$$

$$\liminf_{n\to\infty} \frac{1}{n}\log P_n(U) \ge -\inf_{x\in U} I(x).$$


The large deviation reduces the concentration of measure problem to an optimization problem 
involving the rate function. Next we introduce a few notations which will be needed while solving 
this optimization problem. 


Definition 2.2. For $\beta > 0$, $B \ne 0$ let $m_{\beta,B}$ denote the unique solution of $m = \tanh(\beta m + B)$
with the same sign as that of $B$. For $\beta > 1$, $B = 0$ let $m_{\beta,0}$ denote the unique positive root of the
equation $m = \tanh(\beta m)$. The assertions about the roots of the equation $m = \tanh(\beta m + B)$ can
be found in [21, Section 1.1.3].
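The root in Definition 2.2 can be approximated by monotone fixed-point iteration. This is a sketch rather than the authors' procedure: it starts from $\pm 1$ (the sign of $B$, or $+1$ when $B = 0$), and relies on $m \mapsto \tanh(\beta m + B)$ being increasing, so that the iterates move monotonically to the extreme fixed point on that side. The case $B = 0$, $\beta \le 1$ lies outside the definition; the code returns 0 there, which is the only root.

```python
import math

def magnetization(beta, B, iterations=10000):
    """Approximate m_{beta,B}: the root of m = tanh(beta*m + B) with the sign of B
    (or the positive root when B == 0 and beta > 1)."""
    if B == 0 and beta <= 1:
        return 0.0  # not covered by Definition 2.2; m = 0 is the unique root in this case
    m = 1.0 if B >= 0 else -1.0
    for _ in range(iterations):
        m = math.tanh(beta * m + B)
    return m

for beta, B in ((1.5, 0.0), (0.5, 0.3), (0.5, -0.3), (2.0, 0.1)):
    m = magnetization(beta, B)
    print(beta, B, m, abs(m - math.tanh(beta * m + B)))
```

Convergence is slow near the critical point $\beta = 1$, $B = 0$, where the derivative of the map at the root approaches one; a bracketing root finder would be a more robust choice there.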


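Theorem 2.2. (a) In the setting of Theorem 2.1, the sequence of empirical measures $L_n$ satisfies
a large deviation principle on $\mathcal{P}([q])$ with speed $n$ with respect to the Euclidean topology, with the good
rate function $\widetilde{I}_{\beta,h}(\mu) := I_{\beta,h}(\mu) - \min_{\mu\in\mathcal{P}([q])} I_{\beta,h}(\mu)$, where

$$I_{\beta,h}(\mu) := \sum_{r\in[q]} \Big( \mu_r\log\mu_r - \frac{\beta}{2}\mu_r^2 - h_r\mu_r \Big).$$

Consequently, letting $K_{\beta,h} := \arg\min_{\mu\in\mathcal{P}([q])} I_{\beta,h}(\mu)$, for any $\delta > 0$ we have

$$\limsup_{n\to\infty} \frac{1}{n}\log\mu_n\Big( \min_{\mu\in K_{\beta,h}} \|L_n - \mu\|_\infty \ge \delta \Big) < 0. \tag{2.4}$$

(b) Suppose we are in the setting of Theorem 2.1 with $q = 2$ (which corresponds to the Ising model).

(i) If $h_1 - h_2 = 0$ then

• For $\beta \le 2$, for any $\delta > 0$ there exists $\varepsilon = \varepsilon(\beta,\delta)$ such that for all large $n$ we have

$$\mu_n\Big( \frac{1}{n}\sum_{i\in[n]} \{\delta(y_i,1) - \delta(y_i,2)\} \in [-\delta, \delta] \Big) \ge 1 - e^{-n\varepsilon}.$$

• For $\beta > 2$, for any $\delta > 0$ there exists $\varepsilon = \varepsilon(\beta,\delta)$ such that for all large $n$ we have

$$\mu_n\Big( \frac{1}{n}\sum_{i\in[n]} \{\delta(y_i,1) - \delta(y_i,2)\} \in [m_{\beta/2,0} - \delta,\, m_{\beta/2,0} + \delta] \Big)$$

$$= \mu_n\Big( \frac{1}{n}\sum_{i\in[n]} \{\delta(y_i,1) - \delta(y_i,2)\} \in [-m_{\beta/2,0} - \delta,\, -m_{\beta/2,0} + \delta] \Big) \ge \frac{1}{2} - e^{-n\varepsilon},$$

where $m_{\beta,0}$ is as in Definition 2.2.

(ii) If $h_1 - h_2 = B \ne 0$, for any $\delta > 0$ there exists $\varepsilon = \varepsilon(\beta,B,\delta)$ such that for all large $n$ we have

$$\mu_n\Big( \frac{1}{n}\sum_{i\in[n]} \{\delta(y_i,1) - \delta(y_i,2)\} \in [m_{\beta/2,B/2} - \delta,\, m_{\beta/2,B/2} + \delta] \Big) \ge 1 - e^{-n\varepsilon},$$

where $m_{\beta,B}$ is as in Definition 2.2.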

Remark 2.1. Theorem 2.2(b) gives concentration results for $\frac{1}{n}\sum_{i\in[n]} \{\delta(y_i,1) - \delta(y_i,2)\}$ in the
Ising model, i.e. for the Potts model of (1.1) for $q = 2$. If the Ising model is formulated in such a
way that the spins take values in $\{-1,1\}$, then one can easily see that the results of Theorem 2.2(b)
are equivalent to the exponential concentration of the average spin configuration in that set-up. This
gives a complete picture for the ferromagnetic Ising model for all choices of the vector $h$, for
asymptotically regular graphs. The optimization of $I_{\beta,h}$ for general $q$ for some specific choices of
$h$ is well known in the literature (see [7, 18, 27, 29, 30]). Using these results, similar concentration
results can be derived for the Potts model on asymptotically regular graphs, for those choices of $h$.
We omit the details.

2.2. Ising model on bipartite graphs. This section focuses on the Ising model {q = 2) on 
bipartite graphs. 

Definition 2.3. Let $G_{(a,b),(c,d)}$ denote a bi-regular bipartite graph on $a + b$ labeled vertices, such
that the two partite sets have sizes $a$ and $b$, and the common degrees of vertices in those two partite
sets are $c$ and $d$ respectively. Thus we must have $ac = bd$, which equals the number of edges.

In particular $G_{(a,b),(b,a)}$ denotes the complete bipartite graph with the two partite sets having sizes
$a$ and $b$.

Definition 2.4. For any $p \in (0,1)$ and $\beta \in \mathbb{R}$ set $\phi_{\beta,p}(s) := \tanh(\beta(1-p)\tanh(\beta p s))$. By
elementary calculus it follows that

(a) For $\beta^2 p(1-p) \le 1$ the equation $s = \phi_{\beta,p}(s)$ has the unique root 0.

(b) For $\beta^2 p(1-p) > 1$ the equation $s = \phi_{\beta,p}(s)$ has a unique positive root, denoted hereafter by
$s_{\beta,p}$. Thus the aforementioned equation has three roots, namely $0$, $s_{\beta,p}$, and $-s_{\beta,p}$. Applying the implicit
function theorem, we also note that the function $(\beta,p) \mapsto s_{\beta,p}$ is continuously differentiable in
the open set $\{(\beta,p) : p(1-p)\beta^2 > 1\}$.
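The dichotomy in Definition 2.4 can be explored numerically. The sketch below uses the map $s \mapsto \tanh(\beta(1-p)\tanh(\beta p s))$ from the definition (the Python name `phi` is only a label chosen here) and iterates it from $s = 1$; since the map is increasing and odd, the iterates decrease to its largest fixed point, which is the positive root when $\beta^2 p(1-p) > 1$.

```python
import math

def phi(s, beta, p):
    """The fixed-point map of Definition 2.4."""
    return math.tanh(beta * (1 - p) * math.tanh(beta * p * s))

def positive_root(beta, p, iterations=10000):
    """Approximate the largest root of s = phi(s); equals 0 when beta^2 * p * (1-p) <= 1."""
    if beta * beta * p * (1 - p) <= 1:
        return 0.0
    s = 1.0
    for _ in range(iterations):
        s = phi(s, beta, p)
    return s

for beta, p in ((1.0, 0.5), (3.0, 0.5), (-3.0, 0.5), (4.0, 0.2)):
    s = positive_root(beta, p)
    print(beta, p, s, abs(s - phi(s, beta, p)))
```

The map depends on $\beta$ only through its absolute value, because $\tanh$ is odd; this is consistent with the bipartite Ising model behaving the same way in the ferromagnetic and anti-ferromagnetic regimes.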

Theorem 2.3. Let $G_{(a_n,n-a_n),(c_n,d_n)}$ be a sequence of bi-regular bipartite graphs on $n$ labeled vertices, such
that

$$\lim_{n\to\infty} \frac{a_n}{n} = p \in (0,1), \tag{2.5}$$

and $c_n + d_n \to \infty$, as $n \to \infty$. Then for $q = 2$, $J = \beta I_2$ for some $\beta \in \mathbb{R}$, $h = 0$ in (1.6), setting
$A_n$ to be the adjacency matrix of $G_{(a_n,n-a_n),(c_n,d_n)}$ scaled by $c_n + d_n$ we have

(a) If $\beta^2 p(1-p) \le 1$, then

(b) If $\beta^2 p(1-p) > 1$, then

J™^‘^n(/3,0) = ^ _)_ (1 _ 

where $s_{\beta,p}$ is as in Definition 2.4, and $H(s) := -\frac{1+s}{2}\log\frac{1+s}{2} - \frac{1-s}{2}\log\frac{1-s}{2}$ for $s \in [-1,1]$.


2.3. Potts model on converging sequence of graphs in cut metric. The theory of dense 
graph limits was developed by Borgs, Chayes, Lovasz, and coauthors [10, 11, 37], and has received 
phenomenal attention over the last few years. Recent works of Borgs et al [8, 9] have extended this 
theory beyond the regime of dense graphs. One of the results in [9] is the asymptotics of the log 
partition function $\Phi_n(J,h)$ of (1.6) of a sequence of graphs converging in the sense of cut metric
to functions W that are unbounded. As a byproduct of Theorem 1.1 we are able to provide a 
short proof of their result. Before going to the statement of the result, we first need to introduce 
necessary notations, and concepts. These are taken from [8, 9]. 

Definition 2.5. A function $W : [0,1]^2 \to \mathbb{R}$ is called a symmetric function if $W(x,y) = W(y,x)$
for all $x,y \in [0,1]$. Any symmetric measurable function $W : [0,1]^2 \to \mathbb{R}$ which is integrable,
i.e. $\|W\|_1 := \int_{[0,1]^2} |W(x,y)|\,dx\,dy < \infty$, is called a graphon.

Given a symmetric $n \times n$ matrix $A_n$, define a graphon $W_{A_n}$ on $[0,1]^2$ by dividing $[0,1]^2$ into smaller
squares each of side length $1/n$, and setting $W_{A_n}(x,y) := A_n(i,j)$ if $(x,y)$ is in the $(i,j)$-th box, i.e.
$\lceil nx \rceil = i$, $\lceil ny \rceil = j$.

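The cut norm of a graphon $W$ is given by
\[ \|W\|_\square := \sup_{S,T \subset [0,1]} \left| \int_{S \times T} W(x,y)\,dx\,dy \right|. \]

After identifying graphons with cut distance zero, the set of equivalence classes of graphons
equipped with the cut metric is a compact metric space. The cut norm is equivalent to the $\infty \to 1$
operator norm defined by
\[ \|W\|_{\infty \to 1} := \sup_{f,g \,:\, \|f\|_\infty, \|g\|_\infty \le 1} \left| \int_{[0,1]^2} W(x,y) f(x) g(y)\,dx\,dy \right|. \]
More precisely, we have $\|W\|_\square \le \|W\|_{\infty \to 1} \le 4\|W\|_\square$.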
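For the step graphon built from a small symmetric matrix, both norms can be computed by brute force, which gives a quick sanity check of the two-sided comparison between the cut norm and the $\infty \to 1$ norm. This is only an illustrative sketch: the function names are hypothetical, and it relies on the observation that for a step function both suprema are attained on unions of the $1/n$-boxes (respectively on $\pm 1$-valued step functions), because the integrals are bilinear in the relevant weights.

```python
from itertools import product

def cut_norm(A):
    """Cut norm of the step graphon of the n x n matrix A (exponential in n)."""
    n = len(A)
    best = 0.0
    for S in product((0, 1), repeat=n):
        # Column sums over the rows selected by S.
        col = [sum(A[i][j] for i in range(n) if S[i]) for j in range(n)]
        # The best T collects either all positive or all negative columns.
        pos = sum(c for c in col if c > 0)
        neg = -sum(c for c in col if c < 0)
        best = max(best, pos, neg)
    return best / n**2

def inf_to_one_norm(A):
    """The infinity-to-one norm of the same step graphon, via sign vectors."""
    n = len(A)
    best = 0.0
    for f in product((-1, 1), repeat=n):
        col = [sum(A[i][j] * f[i] for i in range(n)) for j in range(n)]
        # For fixed f, the best g matches the sign of each column sum.
        best = max(best, sum(abs(c) for c in col))
    return best / n**2
```

The enumeration is exponential in $n$, so this is meant only for toy matrices.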
Next we introduce the notion of fractional partition.

Definition 2.6. A $q$-tuple of measurable functions $\rho := (\rho_1, \dots, \rho_q) : [0,1] \to [0,1]^q$, such that
\[ \sum_{r \in [q]} \rho_r(x) = 1, \quad \forall x \in [0,1], \]
will be called a fractional partition of $[0,1]$ into $q$ classes. The set of fractional partitions of $[0,1]$
into $q$ classes will be denoted by $FP_q$.

Now we are ready to state the result about the limiting log partition function for a sequence of 
graphs converging in cut metric. 

Theorem 2.4. Let be a sequence of simple graphs, and let A„ be the adjacency matrix of G„ 
scaled by • If converges in cut metric to a graphon W, then we have 

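\[ \lim_{n\to\infty} \frac{1}{n}\Phi_n(J,h) = \sup_{\rho \in FP_q} F^{J,h}(W,\rho), \]
where
\[ F^{J,h}(W,\rho) := \frac12 \sum_{r,s\in[q]} J_{rs} \int_{[0,1]^2} \rho_r(x)\rho_s(y)W(x,y)\,dx\,dy + \sum_{r\in[q]} h_r \int_{[0,1]} \rho_r(x)\,dx - \int_{[0,1]} \sum_{r\in[q]} \rho_r(x)\log\rho_r(x)\,dx. \]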

Theorem 2.4 follows from [9, Theorem 2.10], and [9, Lemma 3.2]. In Section 5 we give a shorter
proof of the same using Corollary 1.2.

3. Proof of Theorem 1.1

We begin with a simple lemma which allows us to assume that the entries of $A_n$ are $o(1)$.

Lemma 3.1. Let $A_n$ be a sequence of matrices that satisfies the mean-field assumption. Then there
is a sequence of matrices $\tilde A_n$ with 0 diagonal entries which also satisfies the mean-field assumption
such that $\max_{i,j\in[n]} |\tilde A_n(i,j)| = o(1)$, and
\[ |\Phi_n(J,h) - \tilde\Phi_n(J,h)| = o(n), \qquad \sup_{\mathbf q \in \mathcal P([q])^n} |M_n^{J,h}(\mathbf q) - \tilde M_n^{J,h}(\mathbf q)| = o(n), \]
where $\tilde\Phi_n(J,h)$ and $\tilde M_n^{J,h}(\mathbf q)$ are obtained by replacing $A_n$ with $\tilde A_n$ in the corresponding definitions.

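Proof. Since $A_n$ satisfies the mean-field assumption, setting $\varepsilon_n := n^{-1/2}\sqrt{\operatorname{tr}(A_n^2)}$, we see that
$\varepsilon_n \to 0$. Now defining an $n \times n$ symmetric matrix $\tilde A_n$ by
\[ \tilde A_n(i,j) := A_n(i,j)\,1\{|A_n(i,j)| \le \varepsilon_n\}\,1\{i \ne j\}, \]
one immediately has $\max_{i,j\in[n]} |\tilde A_n(i,j)| \le \varepsilon_n \to 0$. Extending the definition of $H_n^{J,h}(\cdot)$ to $\mathcal P([q])^n$
(see Definition 3.1 below for more details), and defining $\tilde H_n^{J,h}$ analogously one has
\[
\begin{aligned}
\sup_{\mathbf q \in \mathcal P([q])^n} \big|H_n^{J,h}(\mathbf q) - \tilde H_n^{J,h}(\mathbf q)\big|
&\le \frac{q\|J\|_\infty}{2}\Big[\sum_{i,j\in[n]} |A_n(i,j)|\,1\{|A_n(i,j)| > \varepsilon_n\} + \sum_{i\in[n]} |A_n(i,i)|\Big] \\
&\le \frac{q\|J\|_\infty}{2}\Big[\frac{1}{\varepsilon_n}\sum_{i,j\in[n]} A_n(i,j)^2 + \sqrt{n\sum_{i\in[n]} A_n(i,i)^2}\Big] \\
&\le \frac{nq\|J\|_\infty\varepsilon_n}{2} + \frac{q\|J\|_\infty}{2}\sqrt{n\operatorname{tr}(A_n^2)} = o(n),
\end{aligned}
\tag{3.1}
\]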


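which immediately implies $\sup_{\mathbf q \in \mathcal P([q])^n} |M_n^{J,h}(\mathbf q) - \tilde M_n^{J,h}(\mathbf q)| = o(n)$. Also we have
\[ |\Phi_n(J,h) - \tilde\Phi_n(J,h)| = \left|\log \frac{\sum_{y\in[q]^n} e^{H_n^{J,h}(y)}}{\sum_{y\in[q]^n} e^{\tilde H_n^{J,h}(y)}}\right| \le \sup_{y\in[q]^n} \big|H_n^{J,h}(y) - \tilde H_n^{J,h}(y)\big| \le \sup_{\mathbf q \in \mathcal P([q])^n} \big|H_n^{J,h}(\mathbf q) - \tilde H_n^{J,h}(\mathbf q)\big|, \]
where the last inequality follows on noting that for any $y \in [q]^n$, setting $q_i(r) = \delta(y_i,r)$ one has
$\mathbf q \in \mathcal P([q])^n$. Since the RHS above is $o(n)$ by (3.1), the proof of the lemma is complete. □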
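The truncation used in this proof is easy to mimic numerically. The sketch below is an illustration under stated assumptions rather than the authors' construction verbatim: it takes the threshold to be $\sqrt{\operatorname{tr}(A_n^2)/n}$ and zeroes out the diagonal together with every entry exceeding the threshold in absolute value; the name `truncate` is hypothetical.

```python
import math

def truncate(A):
    """Return (A_tilde, eps): A with its diagonal and all entries above eps removed."""
    n = len(A)
    # tr(A^2) = sum_{i,j} A[i][j] * A[j][i]
    tr_A2 = sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))
    eps = math.sqrt(tr_A2 / n)
    A_tilde = [
        [A[i][j] if (i != j and abs(A[i][j]) <= eps) else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return A_tilde, eps
```

By construction every entry of the returned matrix is at most the threshold in absolute value, which is the property the lemma needs.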

For the remainder of this section and the next, without loss of generality we will assume that
the diagonal elements of $A_n$ are 0, and $\max_{i,j\in[n]} |A_n(i,j)| = o(1)$. Next we state three lemmas which
are necessary for proving Theorem 1.1. First, for ease of writing we introduce some notation.

Definition 3.1. For any $y \in [q]^n$ define the $nq \times 1$ vector $x := x(y) \in \mathcal X_n$ by setting $x_{ir} := \delta(y_i, r)$,
where
\[ \mathcal X_n := \Big\{ z \in \{0,1\}^{nq} : \sum_{r\in[q]} z_{ir} = 1 \text{ for all } i \in [n] \Big\}. \]
Let $m : [0,1]^{nq} \to \mathbb{R}^{nq}$ be given by
\[ m_{ir}(z) := \sum_{s=1}^q \sum_{j=1}^n J_{rs} A_n(i,j) z_{js}. \]
Note that, since the diagonal entries of $A_n$ are zero, $m_{ir}(z)$ is free of $\{z_{is}, s \in [q]\}$. Next for every
$r \in [q]$, define a map $T_r : (-\infty,\infty)^q \to (0,1)$ by
\[ T_r(m_1, m_2, \dots, m_q) := \frac{\exp(m_r)}{\sum_{s\in[q]} \exp(m_s)}. \]
Define another $nq \times 1$ vector $\hat x$ by
\[ \hat x_{ir} := \mathbb P_{\mu_n}(Y_i = r \mid Y_k = y_k, k \ne i) = T_r(m_{i1}(x) + h_1, \dots, m_{iq}(x) + h_q) = \frac{\exp(m_{ir}(x) + h_r)}{\sum_{s=1}^q \exp(m_{is}(x) + h_s)}, \]
and note that $\hat x \in \hat{\mathcal X}_n$, where
\[ \hat{\mathcal X}_n := \Big\{ z \in (0,1)^{nq} : \sum_{r\in[q]} z_{ir} = 1 \text{ for all } i \in [n] \Big\}. \]
When $Y := (Y_i)_{i\in[n]} \sim \mu_n$ let $X, \hat X$ denote the corresponding random vectors. Finally by a
slight abuse of notation for any $z \in [0,1]^{nq}$ let $H_n^{J,h}(z)$ stand for $F_n(z) + \sum_{i\in[n], r\in[q]} h_r z_{ir}$, where
$F_n : [0,1]^{nq} \to \mathbb R$ is defined by
\[ F_n(z) := \frac12 \sum_{r,s\in[q],\, i,j\in[n]} J_{rs} z_{ir} z_{js} A_n(i,j) = \frac12 \sum_{i\in[n],\, r\in[q]} m_{ir}(z) z_{ir} = \frac12 \sum_{r,s\in[q]} J_{rs} z_r' A_n z_s, \]
$z_r := (z_{ir})_{1\le i\le n} \in \mathbb R^n$, and $z_r'$ denotes the transpose of $z_r$. In this notation $H_n^{J,h}(x(y))$ is the
Hamiltonian of the Potts model at $y \in [q]^n$ in (1.6).
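The objects of Definition 3.1 can be made concrete on a toy graph. The following sketch computes the local fields and the conditional probabilities for a given coloring; the function names are hypothetical, and colors are indexed from 0 rather than from 1 purely for programming convenience.

```python
import math

def one_hot(y, q):
    """x(y): row i is the indicator vector of the color y[i] in {0, ..., q-1}."""
    return [[1.0 if yi == r else 0.0 for r in range(q)] for yi in y]

def local_fields(A, J, x):
    """m[i][r] = sum over s and j of J[r][s] * A[i][j] * x[j][s]."""
    n, q = len(A), len(J)
    return [
        [sum(J[r][s] * A[i][j] * x[j][s] for s in range(q) for j in range(n))
         for r in range(q)]
        for i in range(n)
    ]

def conditional_means(A, J, h, x):
    """x_hat[i][r] = exp(m[i][r] + h[r]) / sum_s exp(m[i][s] + h[s])."""
    x_hat = []
    for m_i in local_fields(A, J, x):
        weights = [math.exp(m_i[r] + h[r]) for r in range(len(h))]
        total = sum(weights)
        x_hat.append([w / total for w in weights])
    return x_hat
```

Since the diagonal of the matrix is zero, changing the color of vertex `i` leaves row `i` of the output unchanged, mirroring the remark that $m_{ir}(z)$ is free of $\{z_{is}, s \in [q]\}$.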


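With this notation we have

Lemma 3.2. If $A_n$ satisfies the mean-field assumption, then
\[ \mathbb E_{\mu_n}\Big[\big(F_n(X) - F_n(\hat X)\big)^2\Big] = o(n^2). \]

Lemma 3.3. If $A_n$ satisfies the mean-field assumption, then
\[ \mathbb E_{\mu_n}\Big[\Big(\sum_{i\in[n],\, r\in[q]} (X_{ir} - \hat X_{ir})\, m_{ir}(X)\Big)^2\Big] = o(n^2), \tag{3.2} \]
and
\[ \mathbb E_{\mu_n}\Big[\sum_{r\in[q]}\Big(\sum_{i\in[n]} (X_{ir} - \hat X_{ir})\Big)^2\Big] = o(n^2). \tag{3.3} \]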


Recalling the definition of a net (see Definition 1.4) we now state our next lemma.

Lemma 3.4. If $A_n$ satisfies the mean-field assumption, then given any $\varepsilon > 0$, there exists a $\sqrt n\varepsilon$-net
$\mathcal D_n(\varepsilon)$ of the set $\{A_n v : v \in [0,1]^n\}$, such that
\[ \lim_{n\to\infty} \frac1n \log|\mathcal D_n(\varepsilon)| = 0. \tag{3.4} \]

We now complete the proof of Theorem 1.1 using Lemma 3.2, Lemma 3.3, and Lemma 3.4, 
deferring the proof of the lemmas to Section 4. 

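Proof of Theorem 1.1. For $z \in [0,1]^{nq}$, and $w \in (0,1)^{nq}$ define
\[ g_n(z,w) := \sum_{i\in[n],\, r\in[q]} z_{ir}\log w_{ir}, \qquad I_n(z) := g_n(z,z). \]
Note that
\[ g_n(x,\hat x) - I_n(\hat x) = \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})\log\hat x_{ir} = \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})(m_{ir}(x) + h_r) - \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})\log\sigma_i, \]
where
\[ \sigma_i := \sum_{s\in[q]} \exp(m_{is}(x) + h_s). \]
Since for each $i \in [n]$
\[ \sum_{r\in[q]} x_{ir} = \sum_{r\in[q]} \hat x_{ir} = 1, \]
we have that
\[ g_n(x,\hat x) - I_n(\hat x) = \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})(m_{ir}(x) + h_r). \]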
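The identity just derived can be checked numerically on a small example. The sketch below evaluates both sides for a given coloring; `check_identity` is a hypothetical helper, and colors are again indexed from 0.

```python
import math

def check_identity(A, J, h, y):
    """Return (lhs, rhs) of g_n(x, x_hat) - I_n(x_hat) = sum (x - x_hat)(m + h)."""
    n, q = len(A), len(h)
    x = [[1.0 if y[i] == r else 0.0 for r in range(q)] for i in range(n)]
    m = [[sum(J[r][s] * A[i][j] * x[j][s] for s in range(q) for j in range(n))
          for r in range(q)] for i in range(n)]
    # sigma[i] is the normalizing sum appearing in the definition of x_hat.
    sigma = [sum(math.exp(m[i][s] + h[s]) for s in range(q)) for i in range(n)]
    x_hat = [[math.exp(m[i][r] + h[r]) / sigma[i] for r in range(q)]
             for i in range(n)]
    lhs = sum((x[i][r] - x_hat[i][r]) * math.log(x_hat[i][r])
              for i in range(n) for r in range(q))
    rhs = sum((x[i][r] - x_hat[i][r]) * (m[i][r] + h[r])
              for i in range(n) for r in range(q))
    return lhs, rhs
```

The two returned values agree up to rounding because the terms involving $\log\sigma_i$ cancel row by row.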

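Therefore, from Lemma 3.3 we deduce that
\[ \mathbb E_{\mu_n}\Big[\big(g_n(X,\hat X) - I_n(\hat X)\big)^2\Big] \le 2\,\mathbb E_{\mu_n}\Big[\Big(\sum_{i\in[n],\, r\in[q]} (X_{ir} - \hat X_{ir})\, m_{ir}(X)\Big)^2\Big] + 2q\|h\|_\infty^2\, \mathbb E_{\mu_n}\Big[\sum_{r\in[q]}\Big(\sum_{i\in[n]} (X_{ir} - \hat X_{ir})\Big)^2\Big] = o(n^2), \]
where we recall $\|h\|_\infty = \max_{r\in[q]} |h_r|$. Similarly, recalling that $H_n^{J,h}(z) = F_n(z) + \sum_{i\in[n],\, r\in[q]} h_r z_{ir}$,
combining Lemma 3.2, and Lemma 3.3, we get
\[ \mathbb E_{\mu_n}\Big[\big(H_n^{J,h}(X) - H_n^{J,h}(\hat X)\big)^2\Big] \le 2\,\mathbb E_{\mu_n}\Big[\big(F_n(X) - F_n(\hat X)\big)^2\Big] + 2q\|h\|_\infty^2\, \mathbb E_{\mu_n}\Big[\sum_{r\in[q]}\Big(\sum_{i\in[n]} (X_{ir} - \hat X_{ir})\Big)^2\Big] = o(n^2). \]
Hence, applying Markov's inequality we see that $\mu_n(X \in \mathcal A_n) \ge 1/2$, where
\[ \mathcal A_n := \Big\{ x \in \mathcal X_n : \big|H_n^{J,h}(x) - H_n^{J,h}(\hat x)\big| \le \frac{\delta_n}{2},\ \big|g_n(x,\hat x) - I_n(\hat x)\big| \le \frac{\delta_n}{2} \Big\} \]
for some $\delta_n = o(n)$.
This implies that
\[ \Phi_n(J,h) \le \log 2 + \log\Big(\sum_{x\in\mathcal A_n} \exp\big(H_n^{J,h}(x)\big)\Big) \le \log 2 + \delta_n + \log\Big(\sum_{x\in\mathcal A_n} \exp\big(H_n^{J,h}(\hat x) - I_n(\hat x) + g_n(x,\hat x)\big)\Big). \tag{3.5} \]
Since $\delta_n = o(n)$, it is enough to upper bound the rightmost term in the RHS of (3.5). This will be
done by approximating the summation over $\mathcal A_n$ by a summation over a suitable net of $\mathcal A_n$.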

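To this end, using Lemma 3.4 we obtain a $\sqrt n\varepsilon$-net $\mathcal D_n(\varepsilon)$ having a sub-exponential size, of the
set $\{A_n v, v \in [0,1]^n\}$. For any $v := (v_1, v_2, \dots, v_q)$ such that $v_r \in \mathcal D_n(\varepsilon)$ for each $r \in [q]$, choose
(if exists) a $\mathbf v(v) \in \mathcal A_n \subset \mathcal X_n \subset \{0,1\}^{nq}$ such that $\|A_n \mathbf v_r(v) - v_r\|_2 \le \sqrt n\varepsilon$ for all $r \in [q]$. Here
$\mathbf v_r(v) := (\mathbf v_{ir}(v))_{i\in[n]}$. Also for any $\mathbf v(v)$ define
\[ \mathcal D(\mathbf v(v)) := \{ x \in \mathcal A_n : \|A_n x_r - A_n \mathbf v_r(v)\|_2 \le 2\sqrt n\varepsilon,\ r \in [q] \}. \]
By the triangle inequality it is easy to see that
\[ \mathcal A_n \subset \bigcup_{v_r \in \mathcal D_n(\varepsilon),\, r\in[q]} \mathcal D(\mathbf v(v)) = \bigcup_{v \in \mathcal D_n^q(\varepsilon)} \mathcal D(\mathbf v(v)), \]
and so
\[ \sum_{x\in\mathcal A_n} \exp\big(H_n^{J,h}(\hat x) - I_n(\hat x) + g_n(x,\hat x)\big) \le \sum_{v \in \mathcal D_n^q(\varepsilon)}\ \sum_{x \in \mathcal D(\mathbf v(v))} \exp\big(H_n^{J,h}(\hat x) - I_n(\hat x) + g_n(x,\hat x)\big). \tag{3.6} \]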
We then claim that for any $x \in \mathcal D(\mathbf v(v))$,

+ \gn{x,x) - gn{x,v{v))\ + |4(®) - /„(v(U))| 

< (q Halloo + + 2q^ II J||^ (ll^ll^ + l)ne + 4q^ || J||^ ne. (3.7) 

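Since $\delta_n = o(n)$ the RHS of (3.7) is bounded by $C(q)n\varepsilon$ for some finite constant $C(q)$, for all large
$n$. Thus using (3.5)-(3.7) and noting the fact that
\[ \sum_{x\in\mathcal X_n} e^{g_n(x,z)} = 1, \]
for any $z \in \hat{\mathcal X}_n$, we deduce that
\[
\begin{aligned}
\Phi_n(J,h) &\le \log 2 + C(q)n\varepsilon + \log\Big(\sum_{v\in\mathcal D_n^q(\varepsilon)}\ \sum_{x\in\mathcal D(\mathbf v(v))} \exp\big(H_n^{J,h}(\hat{\mathbf v}(v)) - I_n(\hat{\mathbf v}(v)) + g_n(x,\hat{\mathbf v}(v))\big)\Big) \\
&\le \log 2 + C(q)n\varepsilon + \log\Big(\sum_{v\in\mathcal D_n^q(\varepsilon)} \exp\big(H_n^{J,h}(\hat{\mathbf v}(v)) - I_n(\hat{\mathbf v}(v))\big)\Big) \\
&\le \log 2 + C(q)n\varepsilon + q\log|\mathcal D_n(\varepsilon)| + \sup_{\mathbf q\in\mathcal P([q])^n} M_n^{J,h}(\mathbf q),
\end{aligned}
\tag{3.8}
\]
where the last inequality uses the fact that for any $\hat x \in \hat{\mathcal X}_n$, setting $q_i(r) = \hat x_{ir}$ one has $q_i \in \mathcal P([q])$,
for each $i \in [n]$. Thus using the fact that $\log|\mathcal D_n(\varepsilon)| = o(n)$ we have
\[ \limsup_{n\to\infty} \frac1n\Big[\Phi_n(J,h) - \sup_{\mathbf q\in\mathcal P([q])^n} M_n^{J,h}(\mathbf q)\Big] \le C(q)\varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, letting $\varepsilon \to 0$ completes the proof. Therefore it only remains to verify
(3.7).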

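Turning to prove (3.7), we recall that for every $v \in \mathcal D_n^q(\varepsilon)$, and any $x \in \mathcal D(\mathbf v(v))$, both $x$ and
$\mathbf v(v)$ are in the set $\mathcal A_n$. Therefore
\[ \big|H_n^{J,h}(x) - H_n^{J,h}(\hat x)\big| \le \delta_n/2, \quad\text{and}\quad \big|H_n^{J,h}(\mathbf v(v)) - H_n^{J,h}(\hat{\mathbf v}(v))\big| \le \delta_n/2. \]
Thus, in order to bound $|H_n^{J,h}(\hat x) - H_n^{J,h}(\hat{\mathbf v}(v))|$, we only need to consider $|H_n^{J,h}(x) - H_n^{J,h}(\mathbf v(v))|$.
Now recall that
\[ H_n^{J,h}(x) = \frac12 \sum_{r,s\in[q]} J_{rs}\, x_r' A_n x_s + \sum_{r\in[q]} h_r\, 1'x_r. \]
Note that for $x \in \mathcal D(\mathbf v(v))$ we have
\[
\begin{aligned}
|x_r' A_n x_s - \mathbf v_r(v)' A_n \mathbf v_s(v)| &\le |x_r' A_n x_s - x_r' A_n \mathbf v_s(v)| + |x_r' A_n \mathbf v_s(v) - \mathbf v_r(v)' A_n \mathbf v_s(v)| \\
&\le \sqrt n\, \|A_n x_s - A_n \mathbf v_s(v)\|_2 + \sqrt n\, \|A_n x_r - A_n \mathbf v_r(v)\|_2 \le 4n\varepsilon.
\end{aligned}
\tag{3.9}
\]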

Next we proceed to bound $|1'x_r - 1'\mathbf v_r(v)|$. From Lemma 3.3, applying Markov's inequality it
follows that $|1'x_r - 1'\hat x_r| \le \delta_n/2$, for every $x \in \mathcal A_n$, and $r \in [q]$. Hence it remains to find an upper
bound on $\|\hat x_r - \hat{\mathbf v}_r(v)\|_2$. To this end, recalling that $\hat x_{ir} = T_r(m_{i1}(x) + h_1, \dots, m_{iq}(x) + h_q)$ and
noting that
\[ \Big\| \frac{\partial T_r(m_1, \dots, m_q)}{\partial m_s} \Big\|_\infty = \big\| T_r(m_1, \dots, m_q)\{\delta(r,s) - T_s(m_1, \dots, m_q)\} \big\|_\infty \le 1, \tag{3.10} \]
applying a multivariate version of the mean-value theorem we obtain that


\xir - v(n).J < ^ |mis(®) -mis(v(n))| = ^ 

se[q] se[g] 


s'e[g] 

— ^ ll'^lloo 'y ^ |(^n®s')* (^nVs'('n))i| • 

s'e[g] 

This further implies that 
__ 2 

Xr-y{v) <g^||J||^ ^ \{AnXs')i - {Anys'{v))if < \\J\\l^ne'^. 

i&[n],s'&[q] 

Therefore, 

+ 5n 


< + 


^ \x'^AnXs - v^(n)^„v^(r7)| \\h\\^ ^ |Tai^ - lV(i’)| 


r,se[q] f’Slg] 

< (9ll^lloo + l)^n + 2 g^||J||^ne-Fv^||/i||^ ^ -v^(T) 


re[q] 


< {<1 Halloo + l)^n + 2g II J||^ (||h||^ + l)ne. 


(3.11) 


Next we proceed to bound \gn{x, x) — gn{x,y{v))\. To this end, we have 
\gn{x,x) - gn{x,v{v))\ 

< ^ logTr(tnii(a;) + h-i, • • • ,mig{x) + hg) - logTr(mii(v(n)) + hi,--- ,mig(v(v)) + hg) 

ie[n],'re[q] 

d log T. 


< ^ |mis(a;)-mis(v(n))| 

ie[n],r,se[g] 

<q ^ |mis(a;) - mis(v(l;))|, 
ie[n],se[i}] 


dm. 


d log T r 
drris 


where the last inequality uses (3.10) to conclude that 
This gives 

\gn{x,x) - gn{x,y{v))\<q ^ |mj 5 (®) - mj 5 (v(T))| 

je[n]se[i}] 


< 1 . 


= « E 

ie[n],se[i}] 


'y ^ Jss'{i^AiiXgi^i (TfiV^'('n))j} 

6 'e[q] 

^^^ll'^lloo X] \{AnXsi)i- {Anys'{v))i\ 


ie[n],s'e[(j] 

<g^v^ Halloo X] - ^nV^'(T )||2 < 2g^ ||J||^ne, (3.12) 

s'e[(?] 
where the penultimate step uses the Cauchy-Schwarz inequality, and the last step uses the fact that
$x \in \mathcal D(\mathbf v(v))$.

Now it remains to bound $|I_n(\hat x) - I_n(\hat{\mathbf v}(v))|$, for which we follow a similar program. Setting
$\gamma(t) := t\log t$ for $t > 0$, we have

\In{x) - In{y{v))\ 


- X] 7(Tr(mii(®) + hi, • • • ,miq{x) + hg)^ - 7(^Tr(mii(v(t;)) + hi, • • • ,mig(v(t;)) -h hgj^ 
i£[n],r£[q] 

dijoJr 


iS[n],r,se[(j] 


dm. 


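Using (3.10) gives
\[ \Big\| \frac{\partial(\gamma\circ T_r)}{\partial m_s} \Big\|_\infty = \big\| T_r(1 + \log T_r)\{\delta(r,s) - T_s\} \big\|_\infty \le \sup_{t\in[0,1]} t\,|1 + \log t| \le 1, \]
and therefore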


\In{x) - In{y{v))\ < q ^ |mj5(a;) - mj 5 (v(i;))| < 2g^ II Jll^ne, (3.13) 

ie[n]se[g] 

where the last bound follows by arguments similar to (3.12). Finally combining (3.11)-(3.13) we 
arrive at (3.7), and this completes the proof. □ 

4. Proof of auxiliary Lemmas 

In this section we prove Lemma 3.2, Lemma 3.3, and Lemma 3.4. We start with the proof of 
Lemma 3.2. 

Proof of Lemma 3.2. To lighten the notation, we drop the subscript $n$ in $F_n$, and write $F$ throughout
the proof. Before we begin the proof let us introduce some notation:
\[ F_{ir}(x) := \frac{\partial}{\partial x_{ir}} F(x), \quad\text{and}\quad F_{ir,js}(x) := \frac{\partial^2}{\partial x_{ir}\,\partial x_{js}} F(x). \]
Equipped with this notation, by the mean-value theorem we have
\[ F(x) - F(\hat x) = \int_0^1 \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})\, F_{ir}(tx + (1-t)\hat x)\,dt. \]
Thus denoting $\Delta(x) := F(x) - F(\hat x)$, and $u_{ir}(t,x) := F_{ir}(tx + (1-t)\hat x)$, we have
\[ \mathbb E_{\mu_n}\Big[\big(F(X) - F(\hat X)\big)^2\Big] = \int_0^1 \sum_{i\in[n],\, r\in[q]} \mathbb E_{\mu_n}\big((X_{ir} - \hat X_{ir})\, u_{ir}(t,X)\, \Delta(X)\big)\,dt. \tag{4.1} \]
Hence to complete the proof it is enough to find an upper bound on the RHS of (4.1) for each value
of $t \in [0,1]$. To this end observe that for any $i \in [n]$, $r \in [q]$ we have
\[ \mathbb E_{\mu_n}\big((X_{ir} - \hat X_{ir})\, u_{ir}(t, X^{(i,r)})\, \Delta(X^{(i,r)})\big) = 0, \]
where $X^{(i,r)}$ is obtained by setting $X_{ir} = 0$ in the random vector $X$. Therefore for each $i \in [n]$, $r \in [q]$
it suffices to consider the difference
\[ (X_{ir} - \hat X_{ir})\, u_{ir}(t,X)\, \Delta(X) - (X_{ir} - \hat X_{ir})\, u_{ir}(t, X^{(i,r)})\, \Delta(X^{(i,r)}), \]
and show that
\[ \sup_{t\in[0,1]} \sum_{i\in[n],\, r\in[q]} \mathbb E_{\mu_n}\big((X_{ir} - \hat X_{ir})(u_{ir}(t,X) - u_{ir}(t,X^{(i,r)}))\, \Delta(X^{(i,r)})\big) = o(n^2), \tag{4.2} \]
and
\[ \sup_{t\in[0,1]} \sum_{i\in[n],\, r\in[q]} \mathbb E_{\mu_n}\big((X_{ir} - \hat X_{ir})\, u_{ir}(t,X)\, (\Delta(X) - \Delta(X^{(i,r)}))\big) = o(n^2). \tag{4.3} \]
To establish (4.2), we first note that using (1.10) there exists $C_1 < \infty$ such that

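\[ \Big| \mathbb E_{\mu_n}\big((X_{ir} - \hat X_{ir})(u_{ir}(t,X) - u_{ir}(t,X^{(i,r)}))\, \Delta(X^{(i,r)})\big) \Big| \le C_1 n\, \mathbb E_{\mu_n}\big| u_{ir}(t,X) - u_{ir}(t,X^{(i,r)}) \big|. \tag{4.4} \]
Since
\[
\begin{aligned}
u_{ir}(t,x) &= t\, m_{ir}(x) + (1-t)\, m_{ir}(\hat x) \\
&= t\, m_{ir}(x) + (1-t) \sum_{j\in[n],\, s\in[q]} J_{rs} A_n(i,j)\, T_s(m_{j1}(x) + h_1, \dots, m_{jq}(x) + h_q),
\end{aligned}
\]
and $m_{ir}(x)$ is free of $\{x_{is}\}_{s\in[q]}$, by the chain rule the RHS of (4.4) can be bounded by
\[ C_1 n \sum_{j\in[n],\, s\in[q]} |J_{rs} A_n(i,j)| \sum_{s'\in[q]} \Big\| \frac{\partial T_s}{\partial m_{s'}} \Big\|_\infty |A_n(i,j) J_{s'r}| \le C_1 q^2 n \|J\|_\infty^2 \sum_{j\in[n]} |A_n(i,j)|^2, \]
where the last step uses (3.10). This, on summing over $i \in [n]$, and $r \in [q]$ gives (4.2) by the
mean-field assumption. Next turning to bound (4.3), we first write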


2(A(X)-A(X('")))= Jab 

a,bG [g] 






a, 6 e[(j] 


a, 6 e[g] 


X'aAnXii^^b - X^^^)aAn{ XOO 


XM^Xb - x;^„ ( X(-) 


(4.5) 


Here the notation ( XO^) j means the 6*^ column of the matrix Xh^). Now note for any a S [(?]\{r’}. 


_ (ir) 
Xa — Xa , 


X 


' A -r — 'rON'4 — V /d ^ ^ A — T- (A 'T 'l 


and 


XrAyiXr X^ ^ AfiXr ^ — Xir (AnXr)i ~i~ Xir (^A^Xr — ‘IXir {AnXr')i , 


where the last equality follows from the fact that An{i,i) = 0. Thus recalling the dehnition of 
mjr(®), we have 
<


Jab (^{Xir - Xir)uir{t, x){x'^AnXb - 

i£[n],a^b^r£[q] 

2 |mi^(a;)|(|mir.(ai)|-h |mi^(®)|) 


ie[n],re[g] 


< 2 < 

✓ 

Y 


n 

Y^^irixf. 

" 2 ] 


J&[q],i&[n] 

re[g] \ 

i=l \ 

j 


(4.6) 


where the first step uses the fact that $|u_{ir}(t,x)| \le t\,|m_{ir}(x)| + (1-t)\,|m_{ir}(\hat x)|$, and the last step follows
by an application of the Cauchy-Schwarz inequality.

Also the mean-field assumption implies that $\lambda_{\max}(A_n) = o(\sqrt n)$, and therefore we have
\[ \sum_{i\in[n]} m_{ir}(x)^2 \le q\|J\|_\infty^2 \sum_{s\in[q]} \|A_n x_s\|_2^2 \le q\|J\|_\infty^2\, \lambda_{\max}^2(A_n) \sum_{s\in[q]} \|x_s\|_2^2 = o(n^2). \]
By similar arguments, from (4.6) we deduce that


= o{jn?). (4.7) 


Next we consider the second term in the RHS of (4.5), where using first order Taylor’s theorem, 
upon application of chain rule, followed by (3.10), gives 

x'^AnXr - x'^An{x^'^'^)^ ^ = (j. A;) | ^ 

j,k£[n] 

~ Y 1 ^jo.^n{j,k)An{i,k)^i ,k,b,ri 

j,fce[n] 

for some ii^k,b,r, such that \^i,k,b,r\ < 9p|loo- Denoting ||A„||^ := sup^j |A„(z, j)|, and summing 
over i & [n], a, b,r £ [q] this gives 


sup 

E 

^ ^ Jab 

- xU)'.4„xf >' 

) 

te[o,i] 

ie[n],re[g] 

\ o,,b^[q] 


] 


^ Jab \^{Xir Xif')Uir{t, x'j^X^A^X^ X^AfiyxJA 

a,b,rG[q],ie[n] 


^ ^ ((Xjj. Xjf )Xjj.1Tlfef)(®)^n(f) ^)Ci,fc,fe,r^ir(A, ®)) 
i,k£[n],b,r£[q] 

< «I|A„ILI|J|L E (|mir.(ai)| -h |mi,.(®)|)|mfcfe(:K)| 


i,k&[n],b,r&[q] 


= gPnIlooPIloo 


E 


mi,.UK 


I ie[n],re[g] 


Y i^fcfe(®) 

I A:e[n],6e[g] 


+ 1 y] |m,,(rK)i |mfc 6 (£) 

Oe[n],r-e[(?] / \fce[n],fee[q] 
Now using (1.10) we obtain
\[ \sum_{i\in[n],\, r\in[q]} |m_{ir}(x)|, \quad \sum_{i\in[n],\, r\in[q]} |m_{ir}(\hat x)| = O(n). \]
This together with the fact that $\|A_n\|_\infty = o(1)$, implies that the RHS above is $o(n^2)$, thus giving


sup 

te[o,i] 


'y ^ I (.^ir ^ir)Uir{t, ^ ^ Jab 

iS[n],re[ij] V a,b£[q] 




= o(n^). 


(4.8) 

Finally, considering the third term in the RHS of (4.5) and using first order Taylor’s theorem again, 
we also note that 

^ ^ - (^xW)^ ^An{j,k) 

j,ke[n\ 

= Y] (xh'’)) ^ij^a,rAn{j,k)An{i,k). 

j,ke[n\ 

From this, proceeding similarly as in the proof of (4.8) we have 

Yj “ Xir)Ui{t,x)^x'f,AnX(^^)a “ 

i,j,ke[n],a,b,re[q] 


^ ^ (^{Xir Xir')Xir^ja{x^^A'jAji(^i,k')^i ja,r'^ir(tjX^'^ 

*dS[n],a,re[i}] 

< 9pn|loo ll'^lloo Y (|mir-(a;)|+ |mi^(x)|)(|mja(a;(*’'l)|) =o(n2) 


lie[n],a,r-e[q] 


as before, and so 


^ (X,,-X('^))u,,(t,X) ^ Jab 


ie[n],re[(j] 


a,6e[g] 


X'A„(x(^Y - X(^Y 


Finally combining (4.7), (4.8), and (4.10), the proof is complete. 

Now we prove Lemma 3.3. 

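Proof of Lemma 3.3. First we prove (3.2). To this end, for any $x \in \mathcal X_n$ define
\[ G(x) := \sum_{i\in[n],\, r\in[q]} (x_{ir} - \hat x_{ir})\, m_{ir}(x) \]
and note that
\[ \mathbb E_{\mu_n}\big[(X_{ir} - \hat X_{ir})\, m_{ir}(X)\, G(X^{(i,r)})\big] = 0. \]
Thus we need to show that
\[ \sum_{i\in[n],\, r\in[q]} \mathbb E_{\mu_n}\big[(X_{ir} - \hat X_{ir})\, m_{ir}(X)\, (G(X) - G(X^{(i,r)}))\big] = o(n^2). \]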


(4.9) 


= o(n^ 

(4.10) 

□ 
To this end, we first observe that

G{x)-G{x^''''^) = 2xirxnir{x)+ ^ -mjs(a;)^+ ^ (*'’)). 

ie[n],se[ij] je[n],se[q] 

(4.11) 

For the first term in the RHS of (4.11), proceeding as in (4.7), by a Cauchy Schwarz argument we 
have 


^ ^ (Xjf 

ie[n],re[(j] 


< ^ mir{x) 

ie[n],re[(j] 


^ = o(n^) 


giving 




E,.. E 


ie[n],re[g] 



= oin^). 


(4.12) 


For controlling the second term in the RHS of (4.11) hrst note that mjs{x)—mjs{x^^^'^) = An{i,j)JrsXir- 
Thus proceeding as in (4.6) again we further have 


^ ^ {Xir Xir^Vilij-(^X^Xjs(xXijs(^x'j TTljs(lC^ ^)) 
*dS[n],r,se[(j] 


^ ^ (^Ir Xir^Xi^XXlij-i^X^XXli'fi^x') 
je[n],re[g] 


< ^ |mir.(, 

ie[n],re[q] 




which is o(n^) by a Cauchy Schwarz argument as in the proof of (4.7). Thus we have 


E 


^ iX^r - Xir)m,,riX)Xjs{mjs{X) - 


= o{n‘^). (4.13) 

*Je[n],r,se[(j] 

Finally for controlling the third term in the RHS of (4.11), applying first order Taylor’s theorem 

yields that Xjs — (xh’')) = An{i,j)^ij^r,s with |Cij,r,s| < q ||T||qq. Thus we have 

V y js 


(Xir - Xir)mir{x)mjs{x^'''''>)^Xjs - | 

i,j&ln],r,se[q] ^ 


< q IIA 


loo 11*^1100 


^ |mjr.(ai)| I +ng^||A 

I «e[n],re[q] 


,,2 lljf V 

>T-ll00 Iloo / . 


m,> X 


ie[n],re[i}] 


which is o(n^) by arguments similar to the proof of (4.8). This gives 


E 


f^n 


(Xir - X,r)mi,riX)mjs{X^Y{Ys " (^) . J 


i,j&[n],r,se[q] 

which on combining with(4.12) and (4.13) completes the proof of (3.2). 
Next to prove (3.3), we define 

Gj. (x) ^ ^ (Xjr Xj^), 


= o(n^), (4.14) 


le n 
and therefore


Thus observing that 


Gr{x) — = Xir — ^ |xjr — 

je[n] ^ 


E 


l-^n 


{Xir - X,r)Gr{X^^’^'>) 


= 0 , 


for any i G [n], r G [g], we only need to show that 

^ E^„ [{Xir - Xir){Gr{X) - = o{r?). 

ie[n] 

This can be done proceeding similarly as above. We omit the details. 


□ 


Now we prove Lemma 3.4. Before going to the proof let us introduce the following notation:

For $r \in \mathbb N$ and $R > 0$ let $B_r(R)$ denote the Euclidean ball of radius $R$ in dimension $r$, i.e.
\[ B_r(R) := \{ v \in \mathbb R^r : \|v\|_2 \le R \}. \]
The proof of Lemma 3.4 also requires the following standard estimate on the size of an $\eta$-net of $B_r(R)$. Its proof
is based on a simple volumetric argument. We refer the reader to [39, Lemma 2.6] for its proof.

Lemma 4.1. For any $R, \eta > 0$, and $r \in \mathbb N$, there exists an $\eta$-net of $B_r(R)$ of size at most
$\max\{1, (3R/\eta)^r\}$.
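In dimension one the bound of Lemma 4.1 can be seen by an explicit construction, sketched below. This is only an illustration for $r = 1$ and not the volumetric argument of [39]; the names `net_size_bound` and `interval_net` are hypothetical.

```python
import math

def net_size_bound(R, eta, r):
    """The bound max{1, (3R / eta)^r} on the size of an eta-net of B_r(R)."""
    return max(1.0, (3.0 * R / eta) ** r)

def interval_net(R, eta):
    """An eta-net of the interval [-R, R], i.e. of the ball B_1(R).

    Centers are spaced 2 * eta apart, and the last one is clamped to R.
    """
    count = max(1, math.ceil(R / eta))
    return [min(-R + eta * (2 * k + 1), R) for k in range(count)]
```

For example, `interval_net(1.0, 0.3)` uses four centers, well within the bound `net_size_bound(1.0, 0.3, 1)`.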


Proof of Lemma 3.4. Let $\{\lambda_1(A_n), \dots, \lambda_n(A_n)\}$ denote the eigenvalues of $A_n$. Fixing $\varepsilon \in (0,1)$, let
$N_n$ denote the number of eigenvalues of $A_n$ which are greater than $\varepsilon/2$ in absolute value. Since $A_n$
satisfies the mean-field assumption, by Chebyshev's inequality we have that
\[ 0 \le \lim_{n\to\infty} \frac{N_n}{n} \le \lim_{n\to\infty} \frac{4}{n\varepsilon^2} \sum_{i\in[n]} \lambda_i(A_n)^2 = 0. \tag{4.15} \]

Set i = in '■= \log 2 '/n\, and for 1 < A: < ^ let Jfc := {1 < z < n : 2 ^“^ < |Ai(^ri)| < 2 ^}. 

Thus with Iq ■= {i ^ i ^ n : e/2 < |Aj(^„)| < 1} and I := and nsing the fact that 

tr(^^) = 0{n), we have 


^|4| = |/|=iV„ 


k=0 


For 0 < A:, j < .£, if Ifc 7 ^ (/> we let Ck{j) denote an e2 ^/\Ik\-net of the set B|/^|(2'^). By Lemma 
4.1 we may and will assnme that 

/ M f /6\I4| / 2^+^' \I4|' 

|Cfc(j)| < maxjl, 


Setting 

Me) ■■= U 

0<J0 Jl,- JZ<^:ELo 

we first claim that 

1 


V\h\' 

{Co(jo) X Ci(ji) X • • • X C^ji)} 


(4.16) 


lim -log|5n(e)| = 0. 
n—>-cx) Ti 


(4.17) 
Deferring the proof of (4.17) and setting

^ ^ • C .— (Cj)jg7 G iSn,(£) j* , 

iel 


where $p_1, p_2, \dots, p_n$ are the eigenvectors of $A_n$, we will now show that $\mathcal D_n(\varepsilon)$ is indeed a $\sqrt n\varepsilon$-net
of $\{A_n v : v \in [0,1]^n\}$ having a sub-exponential size. Since (3.4) is immediate from (4.17), it only
remains to show that $\mathcal D_n(\varepsilon)$ is a $\sqrt n\varepsilon$-net.

To this end, fix $v \in [0,1]^n$, and expand $v$ in the basis $\{p_1, p_2, \dots, p_n\}$ as
\[ v = \sum_{i=1}^n a_i p_i, \]

where ai,a 2 , ■ ■ ■ , G M satisfies 


For 0 < k < i setting 


< n, 

i=l i=\ 

^ vector c G 5n(e) such that 


(4.18) 


Av - '^Xi{An)CiPi 


< y/ne. 


i£l 2 

If Ik ^ (j) for some k, setting jk := max(0, |'log 2 Sfcl) we note that (ayi G Ik) G i3|/^|(2-^'=), and so 
there exists {ci,i G Ik) G Ck{jk) such that 


(4,19) 

i^Ik 


By our choice of jk we have 2-^'' < 2sk if > 1, and = 0 if Sfc < 1. This gives 
£ £ 

^ 22 ^'== 22 ^'=+ Y 2 ^^" < ^ + 4^4 < ^ + 4 

k=0 k\ji^=0 k:jk>l k=0 i=\ 


Y 


Av 'y ^ Xi(^An)ciPi 
iei 




where the last step uses (4.18). Thus we have shown that c = {ci)i^i G 5n(e). Finally, recalling 
that |Aj(74„)| < 2^ for any i ^ Ik, we note 

2 £ 

= Y^ ^i{An)'^ {O-i — Ci)"^ + Y^ ^i( 2 ln)^ 

2 fc =0 ie/fc i^I 

2 ^ I r I 2 ^ 

k=0 i=l 

- , ne^ 

<{Nn + n)— < —, 


where the first two inequalities follow by an use of (4.19) and (4.18), respectively. Thus we have 
shown that IXn{e) is indeed an yTie-net. 
Therefore to complete the proof it suffices to show (4.17). To this effect, fix any $j_0, j_1, j_2, \dots, j_\ell$
such that ^ ~ ^(jO; ji; ■ ■ ■, je) '■= {^ < k < i : Q X > e^/\Ik\}■ Thus 

we have 


log |Co(jo) X Ci{ji) X • • • X Ci{ji)\ < ^ 141 log 

keK. 


6 2'=+^'= N 




(4.20) 


Further denote = Nl^{jo,ji, ■ ■ ■ ,ji) '■= J2keic obviously < Nn- Now using Jensen’s 

inequality, applied for log(-), we have 




k&K 


v1^ 


k&K 


eNL 


\ 


E2“I4 


k=0 




E 2"" \ . 


A:=0 


(4.21) 

where the last step follows by Cauchy-Schwarz’s inequality. Since for any A; > 1, and any i £ 4) 
we have |Aj(^n)| > 2^“^, we therefore deduce that 

i £ n 

^22^141 < |/o|+4j]^|4(7lOP < A^n + 4^4(A)2. 

A:=0 k=l i£lk ^=1 

Now note that Assumption 1.3 in particular implies that -^i(^n)^ < Cn for some positive 

constant C. Therefore recalling that using (4.20), and (4.21), we deduce that 


loglCoO'o) X Ci(ii} X ■■■ X Ceij()\ < A^n(logt^ + 

<iV„(logg)+JV„log "^^OC + l) ^ (,22) 

\ E' lyji 

where the last step uses the facts that lim^^oo = 0) x i-A a:log(l/x) is increasing near 0. 
Therefore from the definition of the set Sn(£) it now follows that 


|5n(e)l < (1 +4^+^exp 


.r /n 6\ , nV5 4C + 1 

A„( log -) + Nn log —— - - 

V e/ Nn 


Now using the fact that lim,i 


JVr,. 


= 0 again the proof completes. 


□ 


Remark 4.1. Note that the proof of Lemma 3.4 goes through as long as the following hold:
\[ \frac1n \sum_{i=1}^n \delta_{\lambda_i(A_n)} \to \delta_0, \qquad \limsup_{n\to\infty} \frac1n \sum_{i=1}^n \lambda_i(A_n)^2 < \infty. \tag{4.23} \]
For example, if $A_n$ is the adjacency matrix of the $n$-star graph $K_{1,n-1}$ then it does not satisfy the
mean-field assumption. Indeed, this follows from observing that
\[ \frac1n \operatorname{tr}(A_n^2) = \frac{2|E(K_{1,n-1})|}{n} \to 2. \]
However, all but 2 of the eigenvalues of the adjacency matrix are zero. Therefore (4.23) holds here,
and hence the proof of Lemma 3.4 goes through unchanged in this case. For the $n$-star graph, one can
directly check that the mean-field approximation (1.8) is tight. In light of this and similar other
examples, we believe the mean-field assumption can be weakened to (4.23), and we conjecture that
the conclusion of Theorem 1.1 continues to hold as long as (4.23) holds.
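The star-graph computation in Remark 4.1 can be verified directly for a moderate $n$. The sketch below builds the unscaled adjacency matrix, evaluates $\operatorname{tr}(A_n^2)$, and checks the two nonzero eigenpairs $\pm\sqrt{n-1}$; all function names are hypothetical, and vertex 0 plays the role of the center.

```python
def star_adjacency(n):
    """Adjacency matrix of the star K_{1,n-1} with center 0."""
    A = [[0] * n for _ in range(n)]
    for j in range(1, n):
        A[0][j] = A[j][0] = 1
    return A

def trace_of_square(A):
    """tr(A^2) = sum_{i,j} A[i][j] * A[j][i], which is 2|E| for a simple graph."""
    n = len(A)
    return sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]
```

With $n = 101$ one gets $\operatorname{tr}(A_n^2) = 200$, so $\operatorname{tr}(A_n^2)/n$ stays close to 2, while the vectors $(\pm 10, 1, \dots, 1)$ are eigenvectors with eigenvalues $\pm 10$.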
5. Proof of Applications


5.1. Proofs of Theorem 2.1 and Theorem 2.2. In this section we compute the limiting log 
partition function for asymptotically regular graphs. This is followed by the proof of large deviation 
principle for the empirical measure of the colors for such graphs. 

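Proof of Theorem 2.1. (a) Since $A_n$ satisfies the mean-field assumption, applying Theorem 1.1 we
get
\[ \lim_{n\to\infty} \frac1n \Big[\Phi_n(J,h) - \sup_{\mathbf q \in \mathcal P([q])^n} M_n^{J,h}(\mathbf q)\Big] = 0. \]
Proceeding to estimate $M_n^{J,h}(\mathbf q)$, fixing $\delta > 0$, we denote
\[ A_n^{(\delta)}(i,j) := A_n(i,j)\, 1\{|\mathcal R_n(i) - 1| \le \delta\}\, 1\{|\mathcal R_n(j) - 1| \le \delta\}. \]
Thus we have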

q n 

r=l i,j=l 

.. q n .. q n 


r=l i,j=l 


r.\nn{i)-l\>& ’•=1 i=i 
q n 


■t H •>' 


*=1 

.. q n „ q n 

r=li,j=l j:|7?,^(j)_l|>5 r=l j=l 

Note that the second term in the RHS of (5.1) is bounded above by 

2? n-, /■\ 2q 


n 


i=l i=l 


= 2rf, 


(5.1) 


(5.2) 


i:|77.„(i)-l|>(5 

where 

- n n 

- (1 - 

1=1 i=l 

Considering the first term in the RHS of (5.1), and noting that $A_n^{(\delta)}$ is a symmetric matrix with
non-negative entries whose row sums are bounded by $1 + \delta$, we apply the Gershgorin circle theorem to
obtain

q n 1 I r ^ ^ 


1 H It' 1 \ X ^ 


(5.3) 


r=l i,j=l 


r=l i=l 


Combining (5.2)-(5.3), along with the expression for M^’ (q), we get 

sup -M;[’^(q) < sup |^^q(r)2^/i^q(r) - ^q(r)logqi 
-PCMW n qeT’(M) . . 


qe'PfM) 


r=l 


r=l 


r=l 


r)'t + Y^^ + 


. (5.4) 
Now note that an^ —)• (5 as $n \to \infty$ by (2.1)-(2.2). Thus taking limits as $n \to \infty$ on both sides of
(5.4), we get 

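\[ \limsup_{n\to\infty}\ \sup_{\mathbf q \in \mathcal P([q])^n} \frac1n M_n^{J,h}(\mathbf q) \le \sup_{q \in \mathcal P([q])} \Big\{ \frac\beta2 \sum_{r=1}^q q(r)^2 + \sum_{r=1}^q h_r q(r) - \sum_{r=1}^q q(r)\log q(r) \Big\} + 2q\beta\delta, \]
from which the upper bound of (2.3) follows, as $\delta > 0$ is arbitrary.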

For the lower bound, taking a supremum over all q = HILi fl* such that q* is same for all i, we have 
sup -M;(’'^(q) > sup { - ^ q(r)2-^7^n(^) + J]]/i,.q(r) - ^ q(r) log q(r)|. 


qeVM)’- n 




2 —' n 

r=l 2=1 r=l r=l 


which on dividing by n, and taking limits using (2.2), gives the lower bound in (2.3). This completes 
the proof of part (a). 
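The variational formula appearing in this proof is simple to explore numerically when $q = 2$. The sketch below maximizes $\frac\beta2\sum_r q(r)^2 + \sum_r h_r q(r) - \sum_r q(r)\log q(r)$ over a grid of two-point distributions; it assumes the coupling takes the form $J = \beta I_q$ suggested by the displayed formula, and the function names and the grid search are illustrative choices only.

```python
import math

def mean_field_value(q1, beta, h=(0.0, 0.0)):
    """Value of the mean-field functional at the distribution (q1, 1 - q1)."""
    qs = (q1, 1.0 - q1)
    energy = 0.5 * beta * sum(t * t for t in qs)
    field = sum(hr * t for hr, t in zip(h, qs))
    entropy = -sum(t * math.log(t) for t in qs if t > 0.0)
    return energy + field + entropy

def argmax_q1(beta, h=(0.0, 0.0), grid=10000):
    """Grid maximizer of the functional over q1 in {0, 1/grid, ..., 1}."""
    return max((k / grid for k in range(grid + 1)),
               key=lambda q1: mean_field_value(q1, beta, h))
```

With $h = 0$, setting the derivative in `q1` to zero gives $m = \tanh(\beta m/2)$ for $m = q(1) - q(2)$, so the uniform distribution is optimal for small $\beta$ and a pair of asymmetric maximizers appears once $\beta > 2$; the grid search reproduces this behavior.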

(b) To prove part (i), we note that TZn{i) = 1 for all i G [n], and thus both (2.1) and (2.2) hold 
trivially. 

Turning to prove part (ii), note that TZnii) = with di{Gn) denoting the degree of vertex 

i G [n]. Since the number of edges \En\ has a Bin(( 2 ),Pn) distribution. 


n 

- ^7^„(^) 


2\E„ 


i=l 


n^Pn 


A-l, in probability. 


This verihes (2.2). To check (2.1), fixing 5 > 0, it suffices to check that lim^^oo = 0, where 

n 

^ ^\di{Gn)-npn\>np„S- 


2=1 


This follows using Chebyshev’s inequality: 

= -J2 - npn\ > npnd) < ~ ^ 0, 


iS n 


as npn —^ OO- 


□ 


Now as an application of Theorem 2.1, we derive the following large deviation principle. As a 
byproduct we also get an exponential concentration of the average sample spins in Ising model. 


Proof of Theorem 2.2. (a) The proof of this theorem is based on Baldi’s theorem (cf. [25, Theorem 
4.5.20]). To this end, we first need to compute logarithmic moment generating function. Fixing a 
vector t = ..., tq) G using Theorem 2.1, one has 


i logE^^e^^-sM irLnir) h + t)- 4>„(J, h)] 

n n 


sup 

q67^(M) 


— sup 
qe'P(M) 


[2 ^ ^ + tr)q{r) 

r=l r=l r=l 

[2 ^ ^^• 

r=l r=l r=l 
Denoting the RHS above by $\Lambda(t)$ we note that


A{t) 


sup 


"I ^ ^ tr^r 

re[g] 


Therefore, applying the duality lemma (see [25, Lemma 4.5.8]), we have 


1/3,h{f^) = sup 
teM9 


{E 

r£[q] 



Next note that the set $\mathcal P([q])$ being compact, the law of $L_n(\cdot)$ is automatically exponentially tight.
Thus using the fact that $\Lambda(t) < \infty$ for all $t \in \mathbb R^q$, applying [25, Theorem 4.5.20(a)] we obtain that
for any closed set $F \subset \mathcal P([q])$,
\[ \limsup_{n\to\infty} \frac1n \log \mu_n(L_n \in F) \le -\inf_{\mu\in F} I_{\beta,h}(\mu). \]

To derive the lower bound we use part (b) of [25, Theorem 4.5.20]. To this end, we note that it 
is enough to prove that any // G 'P{{q]) is an exposed point of i-e. for any v G F{[q]) with 

/r 7 ^ p there exists t G such that 

r/3 '' 1 r/3 '' 1 

Hr log Hr + '^{hr+ tr)Hr DEifE-'i-E-'^ logt'r- +^(/ir +tr)l^r|- 

re[g] r=l r=l r=l re[q] r=l r=l r=l 


This follows on noting the existence of r G [g] such that Hr > ’^r, and then choosing F large enough 
for all r such that Hr > ^r, and F = 0 for all r such that Hr ^ i^r- 

Now to prove (2.4), we note that the function h Ii3,hiF) i® a non constant analytic function on 
a compact set, an thus the infimum is attained on a finite set Thus (2.4) follows from the 

large deviation principle on noting that the set {v G 'P{[q\) : min^g;^^ “ A^lloo — i® closed, 
(b) By the last conclusion of part (a) it suffices to minimize the function Ip^hiF)- To begin introduce 
the variable m = — ^2 S [—Ij 1 ] and note that 





B , , 

—m + H{m) 


■/3 hi + h2' 
.4 2 . 


The optimization of this function has been carried out in [21, Section 1.1.3], where it is shown that
the optimum is at $m = 0$ for $\beta \le 2$, $B = 0$, at $m = \pm m_{\beta/2,0}$ for $\beta > 2$, $B = 0$, and at $m = m_{\beta/2,B/2}$ for
$\beta > 0$, $B \ne 0$. This, along with the symmetry of the Ising model for $B = 0$, completes the proof of
part (b).

□ 


5.2. Proofs of Theorem 2.3 and Theorem 2.4. In this section we prove the convergence of 
log partition function for bi-regular bipartite graphs, followed by the same for a sequence of graphs 
converging in cut metric. 

Proof of Theorem 2.3. To begin, first note that (2.5), along with $a_n c_n = (n - a_n) d_n$, implies $c_n =
\Theta(d_n)$, and therefore we deduce that $c_n$ and $d_n$ individually converge to $\infty$, as $n \to \infty$. This further
implies that $|E_n| = a_n c_n \gg n$. Thus Corollary 1.2 is applicable, and it suffices only to consider
the asymptotics of $\sup_{\mathbf q \in \mathcal P([q])^n} M_n^{\beta,0}(\mathbf q)$. For computing the supremum in this setting, denoting
$b_n := n - a_n$ we have

Mn’®(q) = /3 4^\r)c\f\r)An{i,j) - (^) log 


[o-nlj G[&n] ,rG [2] 


^G[Q.n],^^G[2] 


q5.^1(r)logql^l( 


( 2 ), 


ie[6„],re[2] 

Introducing variables := qj-^^(l) — q^^^(2), and := qj^^(l) — qj^^(2), and noting that 
Efce[ 2 ] = Efce[ 2 ] the R-HS above becomes 


/3 




(1)^ 


-(2)^ 


ie[a„],je[6n] 


le an 


J6[bn 


/3 






_(2)n. /3 OnCn 


ie[an],je[bn] 


*e Qn 


j6[6r, 


2 Ctx -j- dn 


(5.5) 


Hence, it suffices to maximize (5.5) over the set $\{s_i^{(1)} \in [-1,1], i \in [a_n];\ s_j^{(2)} \in [-1,1], j \in [b_n]\}$.

Fixing $n$, first note that the optimum occurs at an interior point where $s_i^{(1)} \in (-1,1)$, $s_j^{(2)} \in
(-1,1)$, for any $i \in [a_n]$, $j \in [b_n]$. This is due to the facts that for any $i \in [a_n]$, we have

® lvl£-® 


ds) 


( 1 ) 


-( 2 ) 


= + 00 , 


d 


ds. 


( 1 ) 




B 




= —oo, 


and a similar argument holds for s) for j G [fen], as well. Thus differentiating with respect to 


sl^l,sl^l and equating to 0, any optimum satisfies the following equations 


sfl =tanh(^/3 E An{i,j)5f^ 
i6[6n] 

=tanh(^/3 E ^n(^,j)sf^ 

We now split the proof into four different cases. 

Case 1: $\beta > 0$, and $\beta^2 p(1-p) < 1$.

Since $\beta > 0$, and $H(x) = H(-x)$, without loss of generality we can assume that for any optimum
we have $s_i^{(1)}, s_j^{(2)} \ge 0$. Next combining (5.6), and (5.7), for every $i \in [a_n]$ we get

sfl=tanh(^/3 E ^n(*, j) tanh E ^n(i, fe)s[ 

j&[bn\ fee[a„ 

Letting sl^^ := argmaxjgj^j^] sl^^, (5.8) further yields 

= tanh (^/3 E -4n(i, j) tanh (^/3 E An{j,k)5[ 

je[bn.] fce[a„ 

< tanh (/3 E A(Lj)tanh(/3 E An{j,k)5\l 

je[bn.] fce[a„ 


(5.6) 

(5.7) 


.( 1 ) 


(5.8) 


-( 1 ) 


-( 1 ) 

'’*0 


= tanh 


Cn T dn 


■ tanh (05 


( 1 ) dn 


^0 p -I- d 

C-72 T 




dn (5, 
Cn+dn 


(1)^ 

*0 


(5.9) 
It is easy to note that


drj 


'/ 3 , 


Cji+dn 


-(s) 


ds 


= /3 


)2 Cnd' 


n^qo n2 \ ^ i 


Thus s i-A 


OO 

Tfo dn (s) is a contraction. This implies that for any s > 0 

’ Cn + dn 




Cn + dn 


.(+ = 




(s) - 


Cn+dn 




Cn + dn 


(0) < Is — 0| = s, 


for all large n. Using (5.9), for large n, we therefore deduce that s^^^ must be equal to zero. This 
further implies that sj^^ must be equal to zero for all i G [a^]. Similar arguments hold for 5j^\ 
proving s)T = Q for all j G [6^]. Plugging in the values of s) , and 5j in the RHS of (5.5) we have 

sup M^’°(q) = I +nlog2, 

qe'P([(j])’^ ^ Cn -t Cln 

which on dividing by n and taking limits proves Case 1. 
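The contraction step in Case 1 can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the proof: it replaces $c_n/(c_n+d_n)$ and $d_n/(c_n+d_n)$ by their limits $1-p$ and $p$, so that the map becomes $s \mapsto \tanh(\beta(1-p)\tanh(\beta p s))$, and the function names `eta` and `iterate_eta` are hypothetical.

```python
import math

def eta(s, beta, p):
    """Limiting map s -> tanh(beta*(1-p)*tanh(beta*p*s)) (assumed limiting form)."""
    return math.tanh(beta * (1 - p) * math.tanh(beta * p * s))

def iterate_eta(s0, beta, p, steps=200):
    """Apply eta repeatedly, starting from s0."""
    s = s0
    for _ in range(steps):
        s = eta(s, beta, p)
    return s

beta, p = 1.5, 0.4
slope_at_zero = beta ** 2 * p * (1 - p)   # derivative of eta at s = 0
limit = iterate_eta(0.9, beta, p)         # iterates collapse toward 0

# eta(s) < s on a grid of positive s, so no positive fixed point exists here.
below_diagonal = all(eta(k / 100, beta, p) < k / 100 for k in range(1, 101))
```

With these illustrative values the slope at zero is below one, the iterates shrink to zero, and the graph of the map stays strictly below the diagonal on $(0,1]$, in line with the conclusion that the only fixed point is $s = 0$.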

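Case 2: $\beta < 0$, and $\beta^2 p(1-p) < 1$.

Note that one can rewrite $M_n^{\beta,0}(\mathbf{q})$ as

$$
\frac{(-\beta)}{2} \sum_{i \in [a_n],\, j \in [b_n]} s_i^{(1)} \big( -s_j^{(2)} \big) A_n(i,j) + \sum_{i \in [a_n]} H\big( s_i^{(1)} \big) + \sum_{j \in [b_n]} H\big( -s_j^{(2)} \big) + \frac{\beta}{2} \cdot \frac{a_n c_n}{c_n + d_n}.
$$

Since $\beta < 0$, one can argue that for any optimum we must have $s_i^{(1)}$ and $-s_j^{(2)}$ non-negative for all
$i \in [a_n]$, and $j \in [b_n]$. The rest of the argument is similar to Case 1. We omit the details.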

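Case 3: $\beta > 0$, and $\beta^2 p(1-p) > 1$.

We begin by noting that

$$
\frac{d}{ds} \eta_{\beta, \frac{d_n}{c_n + d_n}}(s) = \beta^2 \frac{c_n d_n}{(c_n + d_n)^2} \, \mathrm{sech}^2\Big( \frac{\beta c_n}{c_n + d_n} \tanh\Big( \frac{\beta d_n s}{c_n + d_n} \Big) \Big) \, \mathrm{sech}^2\Big( \frac{\beta d_n s}{c_n + d_n} \Big),
$$

which is decreasing in $s$, and goes to zero as $s \to \infty$. Further noting that

$$
\frac{d}{ds} \eta_{\beta, \frac{d_n}{c_n + d_n}}(s) \bigg|_{s=0} = \beta^2 \frac{c_n d_n}{(c_n + d_n)^2} \xrightarrow[n \to \infty]{} \beta^2 p(1-p) > 1,
$$

we deduce that there is a unique positive root of the equation $s = \eta_{\beta, \frac{d_n}{c_n + d_n}}(s)$, denoted by $s_{\beta, \frac{d_n}{c_n + d_n}}$,
for all $n$ large enough. Also (5.9) implies

$$
\max_{i \in [a_n]} s_i^{(1)} = s_{i_0}^{(1)} \le s_{\beta, \frac{d_n}{c_n + d_n}}.
$$

By a similar argument we also deduce

$$
\min_{i \in [a_n]} s_i^{(1)} \ge s_{\beta, \frac{d_n}{c_n + d_n}},
$$

and so $s_i^{(1)} = s_{\beta, \frac{d_n}{c_n + d_n}}$ for all $i \in [a_n]$. Plugging in this solution in (5.7) gives
$s_j^{(2)} = s_{\beta, \frac{c_n}{c_n + d_n}}$ for all $j \in [b_n]$. Thus the optimum solution is

$$
s_i^{(1)} = s_{\beta, \frac{d_n}{c_n + d_n}} \text{ for all } i \in [a_n], \qquad s_j^{(2)} = s_{\beta, \frac{c_n}{c_n + d_n}} \text{ for all } j \in [b_n].
$$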
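Plugging in this optimal solution in the RHS of (5.5) gives

$$
\sup_{\mathbf{q} \in \mathcal{P}([q])^n} M_n^{\beta,0}(\mathbf{q}) = \frac{\beta a_n c_n}{2 (c_n + d_n)} \Big\{ 1 + s_{\beta, \frac{d_n}{c_n + d_n}} \, s_{\beta, \frac{c_n}{c_n + d_n}} \Big\} + a_n H\Big( s_{\beta, \frac{d_n}{c_n + d_n}} \Big) + b_n H\Big( s_{\beta, \frac{c_n}{c_n + d_n}} \Big),
$$

from which part (b) follows on dividing by $n$ and taking limits, on noting that the function $p \mapsto s_{\beta,p}$
is continuous.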


Case 4: $\beta < 0$, and $\beta^2 p(1-p) > 1$.

This can be done by combining the arguments of Case 2, and Case 3. We omit the details. 
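For the regime $\beta^2 p(1-p) > 1$ with $\beta > 0$, the unique positive root of the fixed point equation can be located numerically. The following is a minimal sketch, assuming the limiting form $s \mapsto \tanh(\beta(1-p)\tanh(\beta p s))$ of the map (with $c_n/(c_n+d_n)$ and $d_n/(c_n+d_n)$ replaced by $1-p$ and $p$); the names `eta` and `positive_root` are hypothetical. It also checks that applying $\tanh(\beta p \,\cdot)$ to the root for $p$ gives the root for $1-p$, mirroring how the solution on one side of the bipartition determines the other.

```python
import math

def eta(s, beta, p):
    """Limiting map s -> tanh(beta*(1-p)*tanh(beta*p*s)) (assumed limiting form)."""
    return math.tanh(beta * (1 - p) * math.tanh(beta * p * s))

def positive_root(beta, p, tol=1e-12):
    """Bisection for the positive solution of s = eta(s, beta, p).

    Assumes beta > 0 and beta^2 * p * (1 - p) > 1, so that eta(s) > s for
    small positive s and eta(s) < s past the root.
    """
    if beta <= 0 or beta ** 2 * p * (1 - p) <= 1:
        raise ValueError("no positive root expected in this regime")
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eta(mid, beta, p) > mid:
            lo = mid   # still left of the root
        else:
            hi = mid   # at or right of the root
    return (lo + hi) / 2

beta, p = 3.0, 0.4
root_p = positive_root(beta, p)
root_q = positive_root(beta, 1 - p)
mapped = math.tanh(beta * p * root_p)   # image of the root under one half-step
```

Bisection is used here instead of fixed point iteration only because it gives a guaranteed bracket; any root finder would serve for this illustration.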


Note that the above four cases complete the proof, barring the convergence of $\Phi_n(\beta,0)$ at $\beta =
\pm\beta_c(p) = \pm 1/\sqrt{p(1-p)}$. To complete the proof, first we use the fact that $|\tanh(x)| < |x|$, for any
$x \ne 0$, and deduce that $s_{\beta,p} \to 0$, as $\beta \to \pm\beta_c$. This implies that

$$
\Phi(\beta,0) := \frac{\beta p(1-p)}{2} \big[ 1 + s_{\beta,p} \, s_{\beta,1-p} \big] + p H(s_{\beta,p}) + (1-p) H(s_{\beta,1-p})
$$

is continuous for all $\beta$. Since $\{\Phi_n(\cdot,0)\}$ are convex functions, and the limit of such functions is also a
convex function, using the fact that $\limsup_{n \to \infty} \Phi_n(\beta,0) < \infty$ at $\beta = \pm\beta_c(p)$, the proof is completed
by a standard analysis argument. □

Remark 5.1. Even though we do not pursue it here, by combining the arguments of Theorem 2.1
and Theorem 2.3 one should be able to prove Theorem 2.3 for a sequence of asymptotically bi-regular
bipartite graphs. We believe a similar universality result for the limiting log partition function
of the q-state Potts model holds for general q-partite graphs as well, though proving it will require an
analysis of fixed points of q-dimensional equations for q > 2.


Finally we prove Theorem 2.4. 

Proof of Theorem 2.4. By assumption $W_{nA_n}$ converges to $W$ in the cut metric, and therefore by
[8, Proposition C5 and Proposition C15], we have $\lim_{n \to \infty} [\ldots] = \infty$. Thus applying Corollary 1.2,
we note that it suffices to show

$$
\lim_{n \to \infty} \frac{1}{n} \sup_{\mathbf{q} \in \mathcal{P}([q])^n} M_n^{\beta,B}(\mathbf{q}) = \sup_{p \in FP_q} F^{\beta,B}(W, p). \tag{5.10}
$$

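To this end, setting $p_r(x) = q_i(r)$ for $x \in \big( \tfrac{i-1}{n}, \tfrac{i}{n} \big]$, for each $1 \le r \le q$, $1 \le i \le n$, we note that

$$
\frac{1}{n} M_n^{\beta,B}(\mathbf{q}) = F^{\beta,B}(W_{nA_n}, p). \tag{5.11}
$$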

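Since $W_{nA_n}$ converges to $W$ in the cut metric we have

$$
\sup_{p \in FP_q} \big| F^{\beta,B}(W_{nA_n}, p) - F^{\beta,B}(W, p) \big| \to 0. \tag{5.12}
$$

This implies that

$$
\limsup_{n \to \infty} \frac{1}{n} \sup_{\mathbf{q} \in \mathcal{P}([q])^n} M_n^{\beta,B}(\mathbf{q}) \le \sup_{p \in FP_q} F^{\beta,B}(W, p).
$$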

Thus to establish (5.10), we need to prove the other side of the inequality. To this end,
we note that it suffices to show that given any $p \in FP_q$ there exists $p^{(n)} \in FP_q$, with $p_r^{(n)}$
being constant on $\big( \tfrac{i-1}{n}, \tfrac{i}{n} \big]$, for $1 \le i \le n$, $1 \le r \le q$, such that

$$
\lim_{n \to \infty} F^{\beta,B}(W, p^{(n)}) = F^{\beta,B}(W, p). \tag{5.13}
$$
Indeed, using (5.11), we deduce that, for any $p \in FP_q$,

$$
F^{\beta,B}(W_{nA_n}, p^{(n)}) \le \frac{1}{n} \sup_{\mathbf{q} \in \mathcal{P}([q])^n} M_n^{\beta,B}(\mathbf{q}).
$$

Next taking a liminf on both sides, and using (5.12) and (5.13), we further obtain that

$$
F^{\beta,B}(W, p) \le \liminf_{n \to \infty} \frac{1}{n} \sup_{\mathbf{q} \in \mathcal{P}([q])^n} M_n^{\beta,B}(\mathbf{q}).
$$


Next taking another supremum over $p \in FP_q$, we complete the proof of (5.10).

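Now it only remains to establish (5.13). A standard measure-theoretic argument yields the existence
of $p^{(n)} \in FP_q$, with $p_r^{(n)}$ being constant on $\big( \tfrac{i-1}{n}, \tfrac{i}{n} \big]$, for $1 \le i \le n$, $1 \le r \le q$, such that

$$
\lim_{n \to \infty} \max_{1 \le r \le q} \big| p_r^{(n)}(x) - p_r(x) \big| = 0, \quad \text{Lebesgue almost surely.} \tag{5.14}
$$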


Therefore, noting $\|W\|_1 < \infty$, using the dominated convergence theorem, and the fact that the function
$x \mapsto x \log x$ is continuous on $[0,1]$, we prove (5.13). This completes the proof of the theorem. □
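One way to picture the piecewise constant approximation behind (5.14) is to replace a function by its averages over the $n$ intervals of length $1/n$. The sketch below is an illustration only and makes assumptions not stated in the proof: it uses block averaging as the construction, approximates each average by a midpoint rule, and tests it on the smooth function $x \mapsto x^2$; the names `block_average` and `sup_error` are hypothetical.

```python
def block_average(f, n, samples_per_block=200):
    """Approximate the average of f over each of the n blocks of length 1/n."""
    values = []
    for i in range(n):
        total = 0.0
        for k in range(samples_per_block):
            # Midpoint rule inside block i.
            x = (i + (k + 0.5) / samples_per_block) / n
            total += f(x)
        values.append(total / samples_per_block)
    return values

def sup_error(f, n, grid=2000):
    """Largest gap between f and its block-constant approximation on a grid."""
    values = block_average(f, n)
    worst = 0.0
    for k in range(grid):
        x = (k + 0.5) / grid
        i = min(int(x * n), n - 1)   # index of the block containing x
        worst = max(worst, abs(values[i] - f(x)))
    return worst

def square(x):
    return x * x

error_coarse = sup_error(square, 10)
error_fine = sup_error(square, 100)
```

For this test function the error shrinks as the blocks get finer. Block averages of functions with values in $[0,1]$ stay in $[0,1]$, and averaging each coordinate of a probability vector preserves the sum, which is why such an approximation remains a valid candidate.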